App Engine High Replication Datastore

Question

I'm a total App Engine newbie, and I want to confirm my understanding of the high replication datastore.

The documentation says that entity groups are a "unit of consistency", and that all data is eventually consistent. Along the same lines, it also says "queries across entity groups can be stale".

Can someone provide some examples where queries can be "stale"? Is it saying I could potentially save an entity without any parent (i.e. in its own group), then query for it very soon after and not find it? Does it also imply that if I want data to always be 100% up-to-date, I need to save it all in the same entity group?

Is the common workaround for this to use memcache to cache entities for a period of time longer than the average time it takes for data to become consistent across all data centers? What's the ballpark latency for that?

Thanks

Nick Johnson · Accepted Answer · 2011-05-30T10:02:24

Is it saying I could potentially save an entity without any parent (i.e. in its own group), then query for it very soon after and not find it?

Correct. Technically, this is the case for the regular Master-Slave datastore, too, as indexes are updated asynchronously, but in practice the window of time in which that could happen is so incredibly small you never see it.

If by "query" you mean "do a get by key", though, that will always return strongly consistent results in either implementation.
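To make the distinction concrete, here is a minimal toy model in plain Python. It is not the App Engine API: the ToyDatastore class and all of its method names are made up for illustration, and the explicit apply_pending_index_updates() call stands in for replication and index updates that the real datastore performs on its own schedule. The only point it tries to show is that a get by key reads the entity directly, while a query reads an index that may lag behind the write.

```python
class ToyDatastore:
    """Hypothetical in-memory model of get-vs-query consistency.

    This is NOT the App Engine API; every name here is an assumption
    made for illustration only.
    """

    def __init__(self):
        self._entities = {}  # key -> properties, written synchronously by put()
        self._index = {}     # key -> properties, only updated later
        self._pending = []   # index updates that have not been applied yet

    def put(self, key, props):
        # The entity itself is stored immediately...
        self._entities[key] = dict(props)
        # ...but the index update is only queued.
        self._pending.append((key, dict(props)))

    def get(self, key):
        # A get by key reads the entity directly, so it sees the latest write.
        return self._entities.get(key)

    def query(self, **filters):
        # A query only consults the index, which may be stale.
        return sorted(
            key
            for key, props in self._index.items()
            if all(props.get(name) == value for name, value in filters.items())
        )

    def apply_pending_index_updates(self):
        # Stand-in for the asynchronous work the real datastore does by itself.
        for key, props in self._pending:
            self._index[key] = props
        self._pending = []


store = ToyDatastore()
store.put("greeting-1", {"author": "nick"})

fresh_get = store.get("greeting-1")       # sees the new entity right away
stale_query = store.query(author="nick")  # index not updated yet, so no match

store.apply_pending_index_updates()
later_query = store.query(author="nick")  # now the entity is found
```

In this sketch the write is visible to get() immediately, but query() misses it until the index catches up, which is the "stale" window the documentation is describing.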

Does it also imply that if I want data to be always 100% up-to-date I need to save them all in the same entity group?

You'll need to define what you mean by "100% up-to-date" before it's possible to answer that.

Is the common workaround for this to use memcache to cache entities for a period of time longer than the average time it takes for data to become consistent across all data centers?

No. Memcache is strictly for improving access times; you shouldn't use it in any situation where cache eviction will cause trouble.

Strongly consistent gets are always available to you if you need to guarantee that you're seeing the latest version. Without a concrete example of what you're trying to do, though, it's difficult to provide a recommendation.
