0
votes

I'm using Google App Engine and need the keys of an entity to be between 1000 and 2^31. I'm considering two ways of doing this:

1) Keep a counter of the created keys, as detailed here: https://cloud.google.com/appengine/articles/sharding_counters. But this requires several datastore reads and writes for every key, and I'm not sure it is guaranteed to be consistent.

2) Generate a random int in my range and check whether that key is already in the database. To make it cheap, I'd like a keys_only query, but I can't find a way to do this except by also saving the key as a separate field: MyEntity.query(MyEntity.key_field==new_random_number).fetch(keys_only=True)

Is there a better way to achieve this?

2
You can use ndb's get_or_insert to check whether your new key was already inserted. - voscausa
But that would overwrite the key if it already exists and I don't want that. - user3526468
No, it will get the entity if the key exists. - voscausa
Oh, right, good point. This would probably be the best solution, but in my case some fields in MyEntity depend on the key, so I need to know the key before insertion (or do an update, but that adds extra cost). - user3526468

2 Answers

1
votes

How many writes per second are you expecting in production? Both of your proposals are good, but for our application I decided to go with a sharded counter approach. You can also set the id of an entity before you put it to avoid the query altogether:

MyModel(id="foo")

then you can look it up:

MyModel.get_by_id("foo")

The id doesn't have to be a string; it can also be a number:

MyModel(id=123)
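
Since the id can be chosen up front, the random-key idea from the question needs no separate key_field and no query: pick an id in the range, look it up by id, and retry on a collision. Below is a minimal sketch of that loop in plain Python. The dict `store` is only a stand-in for the datastore, and `allocate_id`, `MIN_ID`, `MAX_ID` and `max_attempts` are hypothetical names; the inclusive upper bound of 2^31 - 1 is also an assumption. In ndb the membership test would be a get_by_id call, and the check and the put would have to run in one transaction to be safe against concurrent requests.

```python
import random

MIN_ID = 1000          # lower bound from the question
MAX_ID = 2 ** 31 - 1   # assumption: ids go up to 2^31 - 1 inclusive


def allocate_id(store, max_attempts=100):
    """Return an unused id in [MIN_ID, MAX_ID] and reserve it in `store`."""
    for _ in range(max_attempts):
        candidate = random.randint(MIN_ID, MAX_ID)
        if candidate not in store:     # stand-in for MyModel.get_by_id(candidate) is None
            store[candidate] = None    # stand-in for MyModel(id=candidate).put()
            return candidate
    raise RuntimeError('no free id found after %d attempts' % max_attempts)


store = {}
ids = [allocate_id(store) for _ in range(5)]
```

The chance that a single attempt collides is roughly the number of stored entities divided by 2^31, so the loop normally returns on the first try unless the range is already densely populated.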

If you decide to go with the sharded counter, here's our production-level code, which is darn close to what you read in that article ;o) Memcache adds the level of consistency we needed to be able to get the right count.

import random

from google.appengine.api import memcache
from google.appengine.ext import ndb


class GeneralShardedCounterConfig(ndb.Model):
    SHARD_KEY_TEMPLATE = 'gen-count-{}-{:d}'
    num_shards = ndb.IntegerProperty(default=200)

    @classmethod
    def all_keys(cls, name):
        config = cls.get_or_insert(name)
        shard_key_strings = [GeneralShardedCounterConfig.SHARD_KEY_TEMPLATE.format(name, index)
                             for index in range(config.num_shards)]
        return [ndb.Key(GeneralShardedCounter, shard_key_string)
                for shard_key_string in shard_key_strings]


# BaseModel is app-specific and not shown here; it is assumed to be an ndb.Model subclass.
class GeneralShardedCounter(BaseModel):
    count = ndb.IntegerProperty(default=0)

    @classmethod
    def get_count(cls, name):
        total = memcache.get(name)
        if total is None:
            total = 0
            all_keys = GeneralShardedCounterConfig.all_keys(name)
            for counter in ndb.get_multi(all_keys):
                if counter is not None:
                    total += counter.count
            # constants.SHORT_MEMCACHE_TTL is app-specific: a cache lifetime in seconds.
            memcache.set(name, total, constants.SHORT_MEMCACHE_TTL)
        return total

    @classmethod
    @ndb.transactional(retries=5)
    def increase_shards(cls, name, num_shards):
        config = GeneralShardedCounterConfig.get_or_insert(name)
        if config.num_shards < num_shards:
            config.num_shards = num_shards
            config.put()

    @classmethod
    @ndb.transactional(xg=True)
    def _increment(cls, name, num_shards):
        index = random.randint(0, num_shards - 1)
        shard_key_string = GeneralShardedCounterConfig.SHARD_KEY_TEMPLATE.format(name, index)
        counter = cls.get_by_id(shard_key_string)
        if counter is None:
            counter = cls(id=shard_key_string)
        counter.count += 1
        counter.put()
        # Memcache increment does nothing if the name is not a key in memcache
        memcache.incr(name)

    @classmethod
    def increment(cls, name):
        config = GeneralShardedCounterConfig.get_or_insert(name)
        cls._increment(name, config.num_shards)

    @classmethod
    @ndb.transactional(xg=True)  # same read-modify-write as _increment, so it needs a transaction too
    def _add(cls, name, value, num_shards):
        index = random.randint(0, num_shards - 1)
        shard_key_string = GeneralShardedCounterConfig.SHARD_KEY_TEMPLATE.format(name, index)
        counter = cls.get_by_id(shard_key_string)
        if counter is None:
            counter = cls(id=shard_key_string)
        counter.count += value
        counter.put()
        # Memcache increment does nothing if the name is not a key in memcache
        memcache.incr(name, value)

    @classmethod
    def add(cls, name, value):
        config = GeneralShardedCounterConfig.get_or_insert(name)
        cls._add(name, value, config.num_shards)
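
To make the idea behind the classes above concrete, here is a small in-memory sketch of a sharded counter with no App Engine dependencies. Each write touches one randomly chosen shard, and a read sums all shards. The class name `InMemoryShardedCounter` and the list-backed shards are illustrative assumptions, not part of the code above.

```python
import random


class InMemoryShardedCounter(object):
    """Toy model of a sharded counter: writes spread over shards, reads sum them."""

    def __init__(self, num_shards=20):
        self.shards = [0] * num_shards

    def add(self, value=1):
        # Pick one shard at random so that concurrent writers rarely
        # contend on the same entity (here: the same list slot).
        index = random.randint(0, len(self.shards) - 1)
        self.shards[index] += value

    def get_count(self):
        # The total is the sum of all shard values.
        return sum(self.shards)


counter = InMemoryShardedCounter(num_shards=20)
for _ in range(1000):
    counter.add()
counter.add(5)
total = counter.get_count()
```

Keep in mind that the total by itself is not a safe id generator: reading the sum and incrementing a shard are separate steps, so two requests can see the same total before either increment becomes visible.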
0
votes

Here is an example of get_or_insert that inserts 7 unique keys:

import webapp2
from google.appengine.ext import ndb
from datetime import datetime
import random
import logging


class Examples(ndb.Model):
    data = ndb.StringProperty()
    modified = ndb.DateTimeProperty(auto_now=True)
    created = ndb.DateTimeProperty()  # NOT auto_now_add HERE !!


class MainHandler(webapp2.RequestHandler):

    def get(self):

        count = 0
        while count < 7:
            random_key = str(random.randrange(1, 9))
            dt_created = datetime.now()
            example = Examples.get_or_insert(random_key, created=dt_created, data='some data for ' + random_key)
            if example.created != dt_created:
                logging.warning('Random key %s not unique' % random_key)
                continue
            count += 1

        self.response.write('Keys inserted')

app = webapp2.WSGIApplication([
    ('/', MainHandler)
], debug=True)