Consider these two cases in App Engine Datastore design:

  1. A is the ancestor of B. We use a single-group transaction to update this entity group.
  2. A and B are both root entities with no ancestor. We use an XG (cross-group) transaction to update both entities.

I can see these advantages in case 2:

  • Just like case 1, it achieves atomicity.
  • If I give A and B the same string or integer ID (they are different kinds, so the keys do not collide), I can look up one of them when I have the other. This would be the equivalent of looking up by parent or child in case 1.
  • Because the entities are not in the same entity group, they are not subject to the per-entity-group write throughput limit.
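
To make the shared-ID point concrete, here is a minimal sketch in plain Python. It does not use the Datastore API; it only models a key as a tuple of (kind, id) pairs, and the helper names `root_key` and `counterpart_key` are made up for illustration.

```python
# Toy model: a key is a tuple of (kind, id) pairs, root element first.
# This is NOT the Datastore API, only an illustration of key paths.

def root_key(kind, entity_id):
    """Key for an entity with no ancestor (it is its own entity group)."""
    return ((kind, entity_id),)

def counterpart_key(key, other_kind):
    """Given a root key, build the key of the other kind with the same ID."""
    _kind, entity_id = key[-1]
    return root_key(other_kind, entity_id)

a_key = root_key("A", 42)
b_key = counterpart_key(a_key, "B")
print(b_key)  # (('B', 42),)
```

With this convention, having either key is enough to construct the other, so each lookup is a direct get by key rather than a query.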

Are the above points correct? When and why should case 1 be used over case 2?

1 Answer

XG transactions can span at most 25 entity groups, so there is a limit on how many separately rooted entities you can change in a single transaction. Additionally, using an ancestor lets you run ancestor queries over the data inside an entity group, and those queries are strongly consistent.
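
As a rough illustration of how that limit interacts with key design, here is a small plain-Python sketch. It is not Datastore code: keys are modeled as tuples of (kind, id) pairs, and `entity_group` and `fits_in_xg_transaction` are hypothetical helpers that only count the distinct roots a set of keys would touch.

```python
MAX_XG_GROUPS = 25  # the limit quoted above

def entity_group(key):
    """An entity group is identified by the root (first) element of the key path."""
    return key[0]

def fits_in_xg_transaction(keys, limit=MAX_XG_GROUPS):
    """True if the keys touch no more than `limit` distinct entity groups."""
    return len({entity_group(k) for k in keys}) <= limit

# 30 posts stored as children of one user: a single entity group.
same_group = [(("User", "alice"), ("Post", i)) for i in range(30)]
# 30 posts stored as root entities: 30 separate entity groups.
separate = [(("Post", i),) for i in range(30)]

print(fits_in_xg_transaction(same_group))  # True
print(fits_in_xg_transaction(separate))    # False
```

The same 30 writes fit in one transaction when they share a root and do not when each entity is its own group.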

Consider the example of a simple microblogging site where users can make posts and view posts from other users. Your home page displays the most recent posts from all users, so an eventually consistent query is acceptable for this data (we can tolerate some staleness here). However, when a user makes a post, they should always see that post show up in their own feed.

If you stored each post in its own entity group (EG), your query for a user's posts might look something like this: SELECT * FROM Post WHERE author = $current_user. However, this is a non-ancestor query and therefore eventually consistent, so a user may make a post and then not see it show up in their feed.

Instead, we can take advantage of EGs and make each Post a child of the User who creates it. Then we can query for a single user's posts with: SELECT * FROM Post WHERE __key__ HAS ANCESTOR KEY('User', $current_user). Because this is an ancestor query, it is strongly consistent and a new post shows up immediately.
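
To show what the ancestor filter selects, here is a minimal in-memory sketch in plain Python. It only models key paths as tuples of (kind, id) pairs and says nothing about consistency; `has_ancestor` and the sample data are assumptions made for illustration, not part of any Datastore client library.

```python
def has_ancestor(key, ancestor):
    """True if `ancestor` is a prefix of `key` (a key matches itself too)."""
    return key[:len(ancestor)] == ancestor

# Each Post key starts with the key of the User who wrote it.
posts = {
    (("User", "alice"), ("Post", 1)): "hello",
    (("User", "alice"), ("Post", 2)): "second post",
    (("User", "bob"), ("Post", 1)): "hi from bob",
}

alice = (("User", "alice"),)
alice_posts = [text for key, text in posts.items() if has_ancestor(key, alice)]
print(alice_posts)  # ['hello', 'second post']
```

The filter works purely on the key path, which is why the Post has to be created with the User as its parent for this kind of query to find it.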

In this case, each user would be limited to creating roughly 1 post per second, which is the write limit for a single entity group. However, this aligns well with a natural boundary: users are already limited by how long it takes to type a post.