0
votes

Using: Google App Engine, Python 2.7, Google NDB Datastore, GQL.

I am writing a business application that has many NDB kinds (ndb.Model), e.g. Customers, Vendors, Suppliers, Manufacturers, Orders, etc.

I would like to present the user with information sorted by various entity fields.

Example index.yaml entries (the Customers kind alone will have maybe 12-15 of them):

indexes:
- kind: Customers
  properties:
  - name: Name
  - name: NewDate
    direction: desc
- kind: Customers
  properties:
  - name: State
  - name: Name
- kind: Customers
  properties:
  - name: Country
  - name: Name
- kind: Customers
  properties:
  - name: Code
  - name: Name
  - name: Class
  - name: LastOrderDate
    direction: desc

This is just an example, but I would like to have about 20-30 NDB kinds (ndb.Model) with about 15 composite indexes each. I want to make sure I have a workaround if I run into the 200-index limit.

I am not sure why Google limits this to 200 in total; I think the limit should be per NDB kind (ndb.Model), maybe 20 each?

Thank you for any suggestions.

1
The new App Engine SDK release notes (Version 1.9.0, February 26, 2014) say: "There is no fixed limit on the number of indexes". Is this only for the Search API, or for all custom indexes of an application? - Brian
Issue 8305 (code.google.com/p/googleappengine/issues/detail?id=8305) is addressing this. As I stated, I think the limit should be per NDB kind. - Brian

1 Answer

3
votes

You need to think again about your data model. Your main problem is not the limit of 200 indexes. If you implement the model that you are contemplating, your indexes will take many times more storage space than your data, and your write costs will be astronomical.

Every time you save an entity, you incur write costs for the entity itself, plus for each indexed property, plus for each custom index. With your data model you will end up paying 15-20 times more than necessary for every update of every entity.
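To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It assumes the per-operation billing model the Datastore used at the time (a new entity put billed as 2 writes, plus 2 writes per indexed property value, plus 1 write per composite index value); check the current pricing documentation before relying on it. The function name and the property counts are hypothetical, chosen only for illustration.

```python
def write_ops_for_new_entity(indexed_property_values, composite_index_values):
    """Estimate billed write operations for putting one new entity.

    ASSUMPTION: legacy per-operation Datastore billing, i.e.
    2 writes for the entity itself
    + 2 writes per indexed property value
    + 1 write per composite (custom) index value.
    """
    return 2 + 2 * indexed_property_values + composite_index_values


# Hypothetical Customers entity with 10 properties.
everything_indexed = write_ops_for_new_entity(10, 15)  # all indexed, 15 custom indexes
lean_model = write_ops_for_new_entity(3, 0)            # 3 indexed, no custom indexes

print(everything_indexed)
print(lean_model)
```

Under these assumptions, every indexed property and every custom index the entity appears in adds to the bill for each put, which is why marking properties as unindexed and dropping rarely needed composite indexes matters.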

You probably have experience with SQL databases, but the datastore is very different. You have to model your data differently.

I built a very complex app with dozens of different entities with very complex relationships between them. I have 5 custom indexes in this app.

EDIT:

There are many approaches to data modeling in non-relational databases. Take, for example, your "Name AND State" index. How many customers are going to have the same name but live in different states? It is much cheaper to retrieve all customers with a given name and then pick the ones that belong to the required state than to create a custom index just for this combination of properties. Remember that reads are much cheaper than writes, and you also end up storing less index data.

The same applies to all of your examples. Instead of a "Code AND Name AND Class AND LastOrderDate" index, you can select all customers by name and order date and, in the very rare case when two customers have the same name, just drop the ones with the wrong code/class.
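The post-filtering idea described above can be sketched in plain Python. This is only an illustration: the `Customer` namedtuple is a hypothetical stand-in for an ndb.Model entity, and the sample list stands in for the result of a single-property query such as `Customers.query(Customers.Name == name).fetch()`, which the built-in per-property index can serve without a custom index. The helper names are made up for this example.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for an ndb.Model entity; field names follow the question.
Customer = namedtuple("Customer", "Name State Code Class LastOrderDate")


def filter_by_state(customers, state):
    """Keep only the customers in the given state (done in memory)."""
    return [c for c in customers if c.State == state]


def filter_by_code_and_class(customers, code, cls):
    """Drop customers with the wrong code/class, newest order first."""
    matches = [c for c in customers if c.Code == code and c.Class == cls]
    return sorted(matches, key=lambda c: c.LastOrderDate, reverse=True)


# Pretend these came back from a query on Name alone.
same_name = [
    Customer("Acme", "CA", "A1", "Retail", date(2014, 1, 5)),
    Customer("Acme", "NY", "A2", "Wholesale", date(2014, 2, 1)),
    Customer("Acme", "CA", "A1", "Retail", date(2014, 3, 9)),
]

in_new_york = filter_by_state(same_name, "NY")
retail_a1 = filter_by_code_and_class(same_name, "A1", "Retail")
```

This only pays off when the first query returns a small result set, as it would for customers sharing one name. Filtering thousands of entities in memory would cost more in reads than the index saves in writes.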