I am developing an application with the Cloud Datastore Emulator (2.1.0) and the google-cloud-ndb Python library (1.6).

I find that there is an intermittent delay before newly written entities become retrievable via a query.

For example, if I create an entity like this:

my_entity = MyEntity(foo='bar')
my_entity.put()

get_my_entity = MyEntity.query().filter(MyEntity.foo == 'bar').get()
print(get_my_entity.foo)

it will fail intermittently because the get() method returns None.

This only happens on about 1 in 10 calls.

To demonstrate, I've created this script (also available with ready to run docker-compose setup on GitHub):

import random

from google.cloud import ndb
from google.auth.credentials import AnonymousCredentials


client = ndb.Client(
    credentials=AnonymousCredentials(),
    project='local-dev',
)


class SampleModel(ndb.Model):
    """Sample model."""
    some_val = ndb.StringProperty()


for x in range(1, 1000):
    print(f'Attempt {x}')
    with client.context():
        random_text = str(random.randint(0, 9999999999))
        new_model = SampleModel(some_val=random_text)
        new_model.put()

        retrieved_model = SampleModel.query().filter(
            SampleModel.some_val == random_text
        ).get()
        print(f'Model Text: {retrieved_model.some_val}')

What would be the correct way to avoid this intermittent failure? Is there a way to ensure the entity is always available after the put() call?

Update: I can confirm that this is only an issue with the Datastore emulator. When testing on App Engine with Firestore in Datastore mode, entities are available immediately after calling put().

Does it happen if you fetch the entity via its id or key? For example SampleModel.get_by_id(<some-id>) or entity.key.get() (check the syntax, I haven't used the datastore for a while) - snakecharmerb
@snakecharmerb just tested and no the issue doesn't present if using SampleModel.get_by_id(int_id)... This makes me think that the issue is a delay in indexing. I wonder if there is a way to force the indexing to happen? - LondonAppDev
It might be trying to simulate eventual consistency? - snakecharmerb
@snakecharmerb just looked that up and there is a flag --consistency which can be set to 1.0 to turn that off... However the issue still persists. - LondonAppDev
Correction: I didn't apply the change properly... --consistency=1.0 resolves the issue. - LondonAppDev

1 Answer

The issue turned out to be related to the emulator trying to replicate eventual consistency.

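Unlike relational databases, Datastore does not guarantee that an entity will be visible to queries immediately after it is written, because of replication and indexing delays. By default the emulator simulates this behaviour, which is why the query above occasionally returns None while a lookup by id or key still finds the entity.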

For things like unit tests, this can be resolved by passing --consistency=1.0 to the emulator start command (gcloud beta emulators datastore start), as documented here.
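
If the emulator has to stay at its default consistency setting, a different workaround is to poll the query until it returns a result. This is not what resolved the issue above, only a minimal sketch of that alternative; the helper name get_with_retry and its default values are made up for illustration and are not part of google-cloud-ndb.

```python
import time


def get_with_retry(fetch, attempts=5, delay=0.1):
    """Call fetch() until it returns something other than None.

    fetch is any zero-argument callable, for example a lambda that
    wraps SampleModel.query().filter(...).get(). The name and the
    defaults are hypothetical; tune them for your own tests.
    """
    for attempt in range(attempts):
        result = fetch()
        if result is not None:
            return result
        # Wait before the next try, but not after the final attempt.
        if attempt < attempts - 1:
            time.sleep(delay)
    # Every attempt returned None: the entity never became visible.
    return None
```

In the script from the question, the query inside client.context() could then be wrapped as get_with_retry(lambda: SampleModel.query().filter(SampleModel.some_val == random_text).get()). The caller still has to handle a None result if the entity never shows up within the allotted attempts.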