In Google App Engine Datastore, to what extent does using parent keys hurt performance?

Question

I have two models which naturally exist in a parent-child relationship. IDs for the child are unique within the context of a single parent, but not necessarily globally, and whenever I want to query a specific child, I'll have the IDs for both parent and child available.

I can implement this two ways.

1. Make the datastore key name of each child entity be the string "<parent_id>,<child_id>", and do joins and splits to process the IDs.
2. Use parent keys.

Option 2 sounds like the obvious winner from a code perspective, but will it hurt performance on writes? If I never use transactions, is there still overhead for concurrent writes to different children of the same parent? Is the datastore smart enough to know that if I do two transactions in the same entity group which can't affect each other, they should both still apply? Or should parent keys be avoided if locking isn't necessary?

Nick Nick · Accepted Answer · 2014-08-01T04:37:35

In terms of the datastore itself, parent/child relationships are conceptual only. That is, the actual entities are not joined in any way.

A key consists of a parent key, a kind, and an ID. The parent key embedded in the child's key is the only link between them.

Therefore, there isn't any real impact beyond the ability to do things transactionally. Similarly, siblings have no actual relationship, just a conceptual one.

For example, you can put an entity into the datastore referencing a parent which doesn't actually exist. That is entirely legitimate and oftentimes very useful.
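To make the "conceptual only" point concrete, here is a toy model of a key in plain Python. It is not the App Engine SDK: the `Key` class, the `store` dict, and the IDs `p1` and `c1` are hypothetical names chosen for illustration. It only sketches the idea that a child key carries its parent's key inside it, and that nothing requires a parent entity to be stored.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Key:
    """Toy stand-in for a datastore key: a kind, an ID, and an optional parent key."""
    kind: str
    id: str
    parent: Optional["Key"] = None

    def path(self) -> Tuple[Tuple[str, str], ...]:
        # The full ancestor path, root first, as (kind, id) pairs.
        prefix = self.parent.path() if self.parent else ()
        return prefix + ((self.kind, self.id),)

    def root(self) -> "Key":
        # The root of the path identifies the entity group.
        return self.parent.root() if self.parent else self


# Hypothetical in-memory "datastore": just a dict from key to properties.
store = {}

parent_key = Key("Parent", "p1")
child_key = Key("Child", "c1", parent=parent_key)

# Store the child only. No entity is ever stored under parent_key.
store[child_key] = {"name": "example child"}
```

In this model the same child ID under a different parent yields a different key, which is what makes per-parent uniqueness sufficient.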

So, the only difference between option 1 and option 2 is that with option 1 you have to do more heavy lifting yourself and cannot take advantage of entity-group transactions or strongly consistent ancestor queries.
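The "heavy lifting" for option 1 might look roughly like the helpers below. This is a minimal sketch in plain Python, not tied to any datastore API; `make_key_name` and `split_key_name` are hypothetical names, and it assumes parent IDs never contain a comma.

```python
SEPARATOR = ","


def make_key_name(parent_id: str, child_id: str) -> str:
    """Build the composite key name "<parent_id>,<child_id>"."""
    if SEPARATOR in parent_id:
        # Assumption: parent IDs are comma-free, otherwise the split is ambiguous.
        raise ValueError("parent_id must not contain the separator")
    return parent_id + SEPARATOR + child_id


def split_key_name(key_name: str) -> tuple:
    """Recover (parent_id, child_id) from a composite key name."""
    # Split on the first separator only, so child IDs may contain commas.
    parent_id, child_id = key_name.split(SEPARATOR, 1)
    return parent_id, child_id
```

With this scheme each child is a root entity in its own entity group, so the encoding and its edge cases are entirely the application's responsibility.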

Edit: The points above do not mention the limit of 1 write per entity group per second. So, to directly answer the original question: using parent keys limits throughput if you want to write to many entities sharing the same parent key within a second, outside of a single transaction.
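As a back-of-the-envelope illustration of that limit, the sketch below estimates how long a burst of separate writes would take if each entity group accepted one write per second. It is a deliberately rough model in plain Python: `min_seconds_for_writes` is a hypothetical helper, the rate is taken from the figure above, and real behavior (contention errors, retries, batching inside one transaction) is not modeled.

```python
from collections import Counter


def min_seconds_for_writes(root_ids, writes_per_group_per_second=1):
    """Estimate the seconds needed for a burst of separate writes.

    root_ids: one entry per write, naming the entity group (root key) it hits.
    Groups are assumed independent, so the busiest group is the bottleneck.
    """
    counts = Counter(root_ids)
    if not counts:
        return 0
    busiest = max(counts.values())
    # Ceiling division without importing math.
    return -(-busiest // writes_per_group_per_second)


# Ten children under one parent: all ten writes land in the same entity group.
same_parent = ["p1"] * 10

# Ten children keyed as "<parent_id>,<child_id>": each is its own entity group.
own_groups = ["p1,c%d" % i for i in range(10)]
```

Under this model `same_parent` needs about ten seconds while `own_groups` needs about one, which is the throughput trade-off between option 2 and option 1 for write-heavy parents.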
