1
votes

I am designing my App Engine datastore code and have thought of a potential issue. I have not been able to find anything concrete in the documentation about what happens in the following situation.

When I store an entity that has an ancestor, I first check in one transaction that the ancestor exists, and create it if it doesn't. I then start a second transaction in which I store the entity, created with the ancestor key that was either found or created in the first step. In testing with one user or very few users this will never be a problem, because the chance of concurrent modification is minimal. Once deployed, however, my concern is that between the first transaction (creating or retrieving the ancestor) and the second (adding an entity as a descendant of the ancestor), another user could delete the ancestor.

My initial thought was to do all of this in one transaction. But in the case where the ancestor does not exist and has to be created, I think an ancestor query that checks whether the entity I want to create already exists will fail because of the datastore's snapshot isolation model. However, I am not sure this is correct.

Does anyone have any knowledge on the matter? If the ancestor was deleted, will the entity commit still work with a parent key that now refers to nothing? Will this recreate the parent so the future checks on it will return the same key? I would test this situation but I am unable to devise a practical way to do so.

2 Answers

1
votes

One possible solution is to fetch the ancestor directly by key (not with a query) and, if the result is null, create both the ancestor and the descendant within a single transaction. This would mimic a Cross-Group Transaction (XG Transaction), since at the time of creation these two entities will not belong to the same entity group.
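To make the "look up by key, then create both in one transaction" idea concrete, here is a minimal sketch. It does not use the Datastore API at all: `ToyStore`, `run_in_transaction` and `save_with_ancestor` are hypothetical names, and the store is just an in-memory dict guarded by a lock. A real datastore provides its own transaction mechanism, so treat this only as an illustration of the control flow.

```python
import threading


class ToyStore:
    """Hypothetical in-memory stand-in for a datastore; not a real API."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key):
        # Direct lookup by key; returns None when nothing is stored.
        return self._data.get(key)

    def run_in_transaction(self, func):
        # Apply func to a private copy and publish it only if func returns
        # normally, so a failure leaves the stored data untouched.
        with self._lock:
            staged = dict(self._data)
            result = func(staged)
            self._data = staged
            return result


def save_with_ancestor(store, parent_key, child_key, child_value):
    """Create the parent if it is missing and store the child, atomically."""

    def txn(data):
        created_parent = parent_key not in data  # lookup by key, not a query
        if created_parent:
            data[parent_key] = {}
        data[child_key] = child_value
        return created_parent

    return store.run_in_transaction(txn)


store = ToyStore()
parent_key = (("Person", "parent"),)
child_key = parent_key + (("Person", "child"),)

first = save_with_ancestor(store, parent_key, child_key, {"n": 1})
second = save_with_ancestor(store, parent_key, child_key, {"n": 2})
```

Because the existence check and both writes happen inside one transaction function, there is no window between "ancestor found or created" and "descendant stored" in which another caller could remove the ancestor.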

For more information about XG Transactions take a look at:

Hope this helps!

0
votes

I believe the answer is: "yes, it will still be created".

This behavior shouldn't be specific to the Python or Java API. I tried creating an entity where the key had ancestors that didn't exist, and it seemed to work in Google Cloud Datastore.

I'm guessing that the logic behind this is based on the process in which Datastore answers the question: "which entity group does this key belong in?" I'd suppose that the decision is made by checking the "top-level" ancestor, which would mean that it shouldn't matter whether or not a Key with only the top-level ancestor exists.

That is, if Key.from_path('Kind1', 'parent') and Key.from_path('Kind1', 'parent', 'Kind1', 'child') are both to be put in the same entity group (independently of one another), then the order in which they are added is irrelevant, and whether the first exists has no bearing on the group where the second will live.

Some example code:

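from gcloud.datastore import demo
from gcloud.datastore.entity import Entity
from gcloud.datastore.key import Key

# Get the demo dataset to work against.
dataset = demo.get_dataset()
entity = Entity()
# Key path: parent ('Person', 'parent') -> child ('Person', 'child').
key = Key.from_path('Person', 'parent', 'Person', 'child').dataset(dataset)
# Attach the key to the entity, then save it.
entity = entity.key(key)
entity.save()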

Note that no 'parent' entity existed when this ran (there was no entity with kind='Person', key_name='parent').
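The guess above about entity groups can be illustrated with a toy model. This is not the Datastore API: `entity_group` and the `stored` dict are hypothetical, and the model simply assumes, as suggested above, that the group is identified by the first (kind, name) pair of a key path.

```python
# Toy model only: a key is a path of (kind, name) pairs, and the entity
# group is assumed to be identified by the first pair in that path.

def entity_group(path):
    """Return the root (kind, name) pair of a key path."""
    if not path:
        raise ValueError("a key path needs at least one (kind, name) pair")
    return path[0]


parent_path = (("Kind1", "parent"),)
child_path = (("Kind1", "parent"), ("Kind1", "child"))

stored = {}  # hypothetical in-memory store keyed by the full path

# Store the child first; nothing in this model needs the parent to exist.
stored[child_path] = {"note": "child saved without a stored parent"}

same_group = entity_group(parent_path) == entity_group(child_path)
parent_exists = parent_path in stored
```

In this model the child's group is computed from its own path alone, so it lands in the same group as the parent key whether or not anything is stored at that key.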