6
votes

I'm a little confused about 'Entity Groups' on the Google App Engine High Replication Datastore (HRD). The Google documentation mentions that HRD only allows 1 write per second per entity group.

What exactly does this mean? Is this 1 write per user request, or 1 write per entity (which I assume is a similar concept to a "table")?

For example, suppose I have a "User" entity and a "Post" table, and "User" is an ancestor of "Post":

  1. Does this mean that one "User" can create one "Post" per second?
  2. ...or does it mean all writes to the "Post" entity are restricted to 1 write per second regardless of the User? (i.e. the system can only save 1 post at a time regardless of the number of users submitting posts)
  3. ...or does it mean a single "User" entity cannot create more than 1 "Post" at the same time (even if thousands of other users are creating "Post" entities)?

What are my options to mitigate this? Is it reasonable to make both "User" and "Post" root entities? Will this allow me to create multiple "Post" instances without hitting the 1 write-per-second restriction? I want to avoid any potential issues if, say, 1000 users were to create "Post" entries concurrently.


2 Answers

8
votes

"entity group" is not like "table." There is nothing that means "table" in the appengine datastore. You should think only in terms of entities and indexes.

You only use entity groups when you want to be able to do operations transactionally. In the case of a blog with "Posts," it probably doesn't matter if you add or remove Posts transactionally, so they do NOT need to be in an entity group.

I have about 15 different kinds of entities in my application, and about 1.5M of them. Every single one is a root entity, even related ones, and I think this is ideal for AppEngine. As far as I can tell, the ONLY purpose for entity groups is to support atomic operations on multiple entities - they are not an organizational tool.

PS: as to your questions about entity group restrictions (which I think will be mostly moot for you now): the write limit is per entity group, not per entity and not per request.

  1. Entities don't create other entities.
  2. If all Posts were in the same entity group, then yes, you could only save about 1 per second.
  3. If each user were in its own entity group, you could write one post in every group at the same time, across as many groups as you liked. It's just that no single group can be written more than about once per second.

Yes, I think "User" and "Post" should both be root entities.
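To make the per-group limit concrete, here is a small plain-Python sketch (it does not call the datastore). It assumes a key can be modelled as a tuple of (kind, id) pairs whose first pair is the root, and that each Post is committed as its own write. The helper names `entity_group` and `busiest_group_writes` are made up for illustration.

```python
from collections import Counter


def entity_group(key_path):
    """Return the root (kind, id) pair; it identifies the entity group."""
    return key_path[0]


def busiest_group_writes(key_paths):
    """Return the number of writes landing on the most contended group.

    At roughly one sustained write per second per group, this is about
    how many seconds that group needs to absorb the whole batch.
    """
    counts = Counter(entity_group(path) for path in key_paths)
    return max(counts.values())


# Layout A: every Post is a root entity, so each Post is its own group.
root_posts = [(("Post", i),) for i in range(1000)]

# Layout B: each Post is a child of its author's User (10 users).
per_user_posts = [(("User", i % 10), ("Post", i)) for i in range(1000)]

# Layout C: every Post hangs off one shared parent (hypothetical "Blog").
single_group_posts = [(("Blog", 1), ("Post", i)) for i in range(1000)]

print(busiest_group_writes(root_posts))          # 1
print(busiest_group_writes(per_user_posts))      # 100
print(busiest_group_writes(single_group_posts))  # 1000
```

With root entities, 1000 concurrent posts put one write on each group. With a single shared parent, all 1000 writes queue up on the same group, which is the case the limit punishes.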

1
votes

Entity groups also give you strong consistency for the data within the group.

For instance, without entity groups, if you create a post and then quickly navigate to the recent post list, you may not see your new post immediately. For a blog, that may not be a problem.

But suppose you're building a task management system: you go to a task detail screen and close the task, which navigates you back to the task list, where the task may still show up as open. That's not acceptable. Here, you'd need entity groups or some other means to make the task list consistent for the current user.

In some data models, it is easy to create entity groups. For instance, making tasks part of a project group would solve the problem, assuming you only ever display tasks for a single project at a time. If your UI allows listing tasks from multiple groups, it's harder to find a model that works.
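The stale task list can be illustrated with a toy in-memory model. This is only a sketch under loud assumptions: `ToyDatastore` and its methods are invented here and are not the App Engine API. It assumes that a query across all groups reads an index that lags behind writes, while a query scoped to one group (an ancestor query) sees the latest data.

```python
class ToyDatastore:
    """Toy model: entities update at once, the global index lags behind."""

    def __init__(self):
        self.entities = {}     # key path -> status, always current
        self.stale_index = {}  # lagging copy read by global queries

    def put(self, key_path, status):
        self.entities[key_path] = status

    def catch_up(self):
        """Simulate the global index eventually applying pending writes."""
        self.stale_index = dict(self.entities)

    def open_tasks_global(self):
        """Query across all groups: reads the lagging index."""
        return sorted(k for k, s in self.stale_index.items() if s == "open")

    def open_tasks_in_group(self, root):
        """Query scoped to one group: reads current data."""
        return sorted(
            k for k, s in self.entities.items() if k[0] == root and s == "open"
        )


store = ToyDatastore()
project = ("Project", 1)
task = (project, ("Task", 7))

store.put(task, "open")
store.catch_up()
store.put(task, "closed")  # the user closes the task

stale_view = store.open_tasks_global()           # still lists the task
group_view = store.open_tasks_in_group(project)  # already empty
store.catch_up()
fresh_view = store.open_tasks_global()           # empty once the index catches up

print(stale_view, group_view, fresh_view)
```

The trade-off is the one discussed in the other answer: putting all of a project's tasks in one group buys the consistent list, but every write to that project then shares the group's write limit.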