Following Jim's answer I did further digging, sharing my findings here as others may find them useful.
I congratulated myself for having decided from day one to build and use a generic wrapper class for all my datastore entities, subclassed by a specific class for each entity kind, instead of operating on model classes directly inheriting from ndb.Model.
Since I already had a db_data @property in this generic class for reading the entity from the datastore on demand, it was easy to plug in my own memcache-based entity caching scheme, plus tracking code to determine whether reads come from memcache or from ndb and whether they happen inside transactions. This is how it looks:
@property
def db_data(self):
    if not hasattr(self, self.attr_db_data):
        db_data = None
        if self.readonly or not ndb.in_transaction():
            # noinspection PyTypeChecker
            db_data = memcache.get(self.memcache_key)
            if db_data:
                if not isinstance(db_data, self.db_model):
                    logging.error('mismatched cached db_data kind, expected %s got type %s %s' %
                                  (self.kind, type(db_data), db_data))
                    db_data = None
                else:
                    if self.trace:
                        logging.debug('%s from memcache' % self.lid)
                    self.cached = True
        if not db_data:
            if hasattr(self, self.attr_db_key):
                db_data = self.db_key.get()
            elif hasattr(self, self.attr_key_id):
                db_data = self.db_model.get_by_id(id=self.key_id)
            else:
                raise RuntimeError('missing db_data, db_key and key_id')
            if db_data:
                # use ndb model names as strings in the list below. TODO: don't leave them on!
                show_trace = self.kind in ['']
                if self.trace or show_trace:
                    if show_trace:
                        self.print_stack(4)
                    logging.debug('%s from datastore' % self.lid)
                self.cached = False
                # noinspection PyTypeChecker
                memcache.set(self.memcache_key, db_data)
        setattr(self, self.attr_db_data, db_data)
    return getattr(self, self.attr_db_data)
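For context, the property above relies on a handful of helpers on the generic wrapper (attr_db_data, memcache_key, lid, print_stack, the readonly/trace flags). The following is only a minimal sketch of how such a base class could be wired up - the class name, constructor and defaults are my assumptions for illustration, not the actual implementation:

import logging
import traceback

# imports needed by the db_data / db_data_put code shown in this answer
from google.appengine.api import memcache
from google.appengine.ext import ndb


class DbDataWrapper(object):  # hypothetical name for the generic wrapper
    # names of the lazily-set instance attributes used via hasattr()/setattr()
    attr_db_data = '_db_data'
    attr_db_key = 'db_key'
    attr_key_id = 'key_id'

    db_model = None  # set to the ndb.Model subclass in each kind-specific wrapper
    kind = None      # the corresponding ndb kind name, as a string

    def __init__(self, db_key=None, key_id=None, readonly=False, trace=False):
        if db_key is not None:
            self.db_key = db_key
        if key_id is not None:
            self.key_id = key_id
        self.readonly = readonly    # allows memcache hits even inside transactions
        self.trace = trace          # per-instance debug logging switch
        self.cached = False         # set by db_data when served from memcache
        self.put_count = 0          # incremented by db_data_put
        self.update_needed = False  # set by app code whenever db_data was modified

    @property
    def memcache_key(self):
        # one memcache entry per entity, namespaced by kind
        key_id = self.key_id if hasattr(self, 'key_id') else self.db_key.id()
        return '%s:%s' % (self.kind, key_id)

    @property
    def lid(self):
        # short identifier used in log messages
        return self.memcache_key

    def print_stack(self, depth):
        # log the last few stack frames to locate the offending caller
        logging.debug(''.join(traceback.format_stack()[-depth - 1:-1]))

Each entity kind then gets a thin subclass that just sets db_model and kind (and, where needed, its own defaults).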
With this in place I uncovered:
- a lot of unnecessary reads from ndb, most notably of the sequence configuration entities, which did not change at all during the entire sequence yet were accessed hundreds of times
- some reads for the same entities repeated inside the same request
- most importantly, the details behind my many transactions failing with TransactionFailedError(The transaction could not be committed. Please try again.) or TransactionFailedError: too much contention on these datastore entities. please try again, caused by all these unnecessary reads done from many transactions executed in parallel (sketched below). I knew, or rather suspected, the root cause - see the related Contention problems in Google App Engine question - but with these details I can now actually work towards reducing it.
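To make the contention finding more concrete (this is an illustration, not code from the app): reading an entity inside an ndb transaction pulls its entity group into that transaction, so the commit fails if anything else writes to that group in the meantime, even though the transaction never modifies the config itself. Something along these lines, with made-up SequenceConfig/SequenceState models:

from google.appengine.ext import ndb


class SequenceConfig(ndb.Model):   # effectively static during a sequence
    steps = ndb.IntegerProperty(repeated=True)


class SequenceState(ndb.Model):    # the entity actually being updated
    step = ndb.IntegerProperty(default=0)


@ndb.transactional(xg=True)
def advance_contended(state_key, config_key):
    # the config read drags its entity group into the transaction, widening
    # the window for TransactionFailedError under parallel load
    config = config_key.get()
    state = state_key.get()
    state.step = min(state.step + 1, len(config.steps))
    state.put()


def advance_cached(state_key, config):
    # reading the unchanging config outside the transaction (or from a private
    # cache like the db_data property above) keeps the transaction's footprint
    # down to the entity group it actually modifies
    @ndb.transactional
    def txn():
        state = state_key.get()
        state.step = min(state.step + 1, len(config.steps))
        state.put()
    txn()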
For the write side, I added a db_data_put() method (with many checks and tracking support) to the generic class and replaced all the .db_data.put() calls scattered across the app code with it. This is how it looks:
def db_data_put(self, force=False):
    #assert ndb.in_transaction() TODO: re-enable after clearing all violations
    self.db_key = self.db_data.put()
    show_trace = False
    if not ndb.in_transaction():
        logging.error('%s: db_data_put called outside transaction, readonly=%s' % (self.lid, self.readonly))
        show_trace = True
    if self.readonly:
        logging.error('%s: db_data_put called with readonly set' % self.lid)
        show_trace = True
    if force:
        if self.trace:
            logging.warn('%s: db_data_put called with force arg set' % self.lid)
    if self.cached:
        logging.error('%s: db_data_put called with cached data' % self.lid)
        show_trace = True
    if self.put_count:
        logging.warn('%s: db_data_put already called %d time(s)' % (self.lid, self.put_count))
        show_trace = True
    self.put_count += 1
    if self.update_needed:
        self.update_needed = False
    elif not force:
        if self.trace:
            logging.warn('%s: db_data_put called without update_needed set' % self.lid)
        show_trace = True
    # noinspection PyTypeChecker
    memcache.set(self.memcache_key, self.db_data)
    # use ndb model names as strings in the list below. TODO: don't leave them on!
    show_kind = self.kind in ['']
    if show_trace or show_kind:
        self.print_stack(4)
    if self.trace or show_trace or show_kind:
        logging.debug('%s: db_data_put %s' % (self.lid, self.kind))
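For illustration, this is roughly how the scattered .db_data.put() calls were replaced; SequenceStateWrapper and advance_sequence are hypothetical names reusing the wrapper sketch above:

from google.appengine.ext import ndb


@ndb.transactional
def advance_sequence(state_id):
    # hypothetical kind-specific subclass of the generic wrapper
    wrapper = SequenceStateWrapper(key_id=state_id)
    wrapper.db_data.step += 1
    wrapper.update_needed = True

    # before: a direct put, invisible to any checks or tracking
    # wrapper.db_data.put()

    # after: one checked, tracked write that also refreshes the memcache entry
    wrapper.db_data_put()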
With it I uncovered some room for improvement and a few lurking bugs:
- entities written outside transactions (room for data corruption)
- entities written multiple times in response to the same request, thus violating the max 1 write/second/entity group datastore limit (see the deferred-write sketch after this list)
- several hot requests occasionally clustering very close to each other, thus contributing to the contention problem and, when writing the same entities, also violating the 1 write/sec limit
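The update_needed flag suggests one way to address the multiple-writes-per-request bullet above: let each piece of logic only mutate the cached db_data and flip the flag, then issue a single db_data_put() at the end of the transaction. A sketch under those assumptions (the wrapper class, the retries property and the needs_* helpers are all hypothetical):

from google.appengine.ext import ndb


@ndb.transactional
def handle_request(state_id):
    wrapper = SequenceStateWrapper(key_id=state_id)  # hypothetical wrapper as above

    # several independent pieces of logic may touch the same entity...
    if needs_step_advance(wrapper.db_data):
        wrapper.db_data.step += 1
        wrapper.update_needed = True
    if needs_retry_reset(wrapper.db_data):
        wrapper.db_data.retries = 0
        wrapper.update_needed = True

    # ...but the entity group sees at most one write per request, respecting the
    # 1 write/second/entity group limit and keeping db_data_put's put_count check quiet
    if wrapper.update_needed:
        wrapper.db_data_put()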
After setting the self.readonly flag on all sequence config objects and on a couple of other frequently but unnecessarily read ones (to enable caching even inside transactions - see the sketch at the end of this answer), serializing the hot writing requests, and fixing the most critical bugs found, I re-ran the clean measurement test:
- the entire sequence execution time dropped to ~13 minutes, probably thanks to serializing the hot transactions, which helped reduce the "noise" of the sequence and the overall number of logic state transitions and push queue tasks
- the number of data contention-related transaction failures dropped by ~60% - I attribute most of this to my private caching of entities not expected to change during a transaction (ndb has no clue whether an entity accessed in a transaction will eventually be written or not) and to serializing the hot transactions
- the Cloud Datastore Entity Writes dropped to 0.0055 Million Ops - probably due to the fewer sequence state transitions mentioned above
- the Cloud Datastore Read Operations dropped to 0.01 Million Ops - my caching helps during transactions
The ratio of writes to reads actually increased, invalidating my assumption that they should be comparable - the fact that most of the sequence activity happens inside transactions matters here. I guess my assumption would only hold in a non-transactional context.
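As mentioned above, the readonly fix amounted to little more than flipping the flag for the kinds that never change during a sequence. Roughly (again using the hypothetical class names from the sketches above, not the actual ones):

class SequenceConfigWrapper(DbDataWrapper):  # hypothetical kind-specific wrapper
    db_model = SequenceConfig
    kind = 'SequenceConfig'

    def __init__(self, **kwargs):
        # config entities never change while a sequence runs, so let db_data
        # serve them from memcache even inside transactions
        kwargs.setdefault('readonly', True)
        super(SequenceConfigWrapper, self).__init__(**kwargs)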