2
votes

I've got a Google App Engine entity kind with over 2 million entries, which takes up about 2GB. According to the datastore statistics, the built-in indexes are 13GB (75 million entries), and the composite indexes are 1GB (4 million entries).

I understand that the size of my composite indexes is related to how many indexes I have defined in my index.yaml file.

However, why are my built-in indexes so much larger than the data itself, and what can I do to reduce the built-in indexes?


1 Answer

3
votes

Most model properties are indexed by default; see https://developers.google.com/appengine/docs/python/ndb/properties. The Pickle, JSON, LocalStructured and Blob properties are not indexed by default. That means every other property gets a built-in index unless you specify indexed=False on it.

from google.appengine.ext import ndb

class User(ndb.Model):
    display_name = ndb.StringProperty(indexed=False)  # will not be indexed
    modified = ndb.DateTimeProperty(indexed=False)  # will not be indexed
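
As a rough sanity check on the numbers in the question, the statistics work out to about 37 built-in index entries per entity. The calculation below uses only the figures quoted in the question. The idea that each indexed property value costs two built-in index rows (one ascending, one descending) is my assumption and worth checking against the Datastore documentation.

```python
# Figures quoted in the question (both are approximate).
entities = 2000000           # "over 2 million entries"
builtin_entries = 75000000   # "75 million entries" in the built-in indexes

# Average number of built-in index entries stored per entity.
entries_per_entity = builtin_entries / float(entities)

# Assumption (not from the question): two built-in index rows
# (ascending + descending) per indexed property value.
rows_per_value = 2
indexed_values_per_entity = entries_per_entity / rows_per_value

print(entries_per_entity, indexed_values_per_entity)
```

If that assumption holds, each entity carries somewhere under 20 indexed values on average (repeated properties count once per value), which is why switching off indexing on properties you never filter or sort by can shrink the built-in indexes a lot.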

Most of the time a lot of your properties are ones you never query on, so their indexes are wasted space. Note, however, that right now you can't use an unindexed property in a composite index; this has already been reported as a feature request: https://code.google.com/p/googleappengine/issues/detail?id=4231

Once you've added indexed=False, the index entries already written for existing entities are not removed automatically. To get rid of them you'll need to rerun entity.put() on all of the existing entities.
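
A minimal sketch of that re-put step, written as a plain batching helper so it can be tried without the App Engine SDK. The name reput_in_batches and the batch size of 500 are my own choices for illustration, not part of NDB.

```python
from itertools import islice


def reput_in_batches(entities, put_multi, batch_size=500):
    """Re-save every entity in fixed-size batches.

    `entities` is any iterable of entities and `put_multi` is a callable
    that writes a list of them. Returns the number of entities written.
    """
    iterator = iter(entities)
    total = 0
    while True:
        # Pull the next batch_size items off the iterator.
        batch = list(islice(iterator, batch_size))
        if not batch:
            break
        put_multi(batch)
        total += len(batch)
    return total


# Hypothetical usage inside an App Engine app with the User model above:
#   from google.appengine.ext import ndb
#   reput_in_batches(User.query().iter(), ndb.put_multi)
```

With 2 million entities this will almost certainly not fit in a single request, so in practice you would probably run it from a backend or task queue job and resume with query cursors; I've left that part out.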