4
votes

I'm a bit confused by some of the GAE documentation. While I intend to add indexes to optimize performance of my application, I wanted to get some clarification on if they are only suggested for this purpose or if they are truly required.

Queries can't find property values that aren't indexed. This includes properties that are marked as not indexed, as well as properties with values of the long text value type (Text) or the long binary value type (Blob).

A query with a filter or sort order on a property will never match an entity whose value for the property is a Text or Blob, or which was written with that property marked as not indexed. Properties with such values behave as if the property is not set with regard to query filters and sort orders.

from http://code.google.com/appengine/docs/java/datastore/queries.html#Introduction_to_Indexes

The first paragraph leads me to believe that you simply cannot sort or filter on unindexed properties. However, the second paragraph makes me think that this limitation is only confined to Text or Blob properties or properties specifically annotated as unindexed.

I'm curious about the distinction because I have some numeric and string fields that I am currently sorting/filtering against in a production environment which are unindexed. These queries are being run in a background task that mostly doesn't care about performance (would rather optimize for size/cost in this sitation). Am I somehow just lucky that these are returning the right data?

4

4 Answers

6
votes

In the GAE datastore, single property indexes are automatically created for all properties that are not unindexable (explicitly marked, or of those types).

The language in that doc, I suppose, is a tad confusing.

You only need to explicitly define indexes when you want to index by more than one property (say, for sorting by two different properties.)

4
votes

In GAE, unfortunately if the property is marked as unindexed

  num = db.IntegerProperty(required=True, indexed=False)

Then it is impossible to include it in the custom index... This is counterproductive (Most built-in indices are never used by my code, but take lots of space). But it is how GAE currently works.

Datastore Indexes - Unindexed properties:

Note: If a property appears in an index composed of multiple properties, then setting it to unindexed will prevent it from being indexed in the composed index.

1
votes

Never add a property to a model without EXPLICITLY entering either indexed=True or indexed=False. Indices take substantial resources: space, write ops costs, and latency increases when doing put()s. We never, never add a property without explicitly stating its indexed value even if the index=False. Saves costly oversights, and forces one to always think a bit about whether or not to index. (You will at some point find yourself cursing the fact that you forgot to override the default=True.) GAE Engineers would do a great service by not allowing this to default to True imho. I would simply not provide a default if I was them. HTH. -stevep

0
votes

you must use index if you want to use two or more filter function in one single query.
e.g:
Foobar.filter('foo =', foo).filter('bar =', bar)

if you just query with one filter, no need to use index, which is auto-generated.

for Blob and Text, you can't generate index for them, even you specify it in index.yaml, meanwhile you can't use filter in them.
e.g.
class Foobar(db.Model):
content = db.TextProperty()
Foobar.filter('content =', content)
codes above will raise an Error because TextProperty can't be assigned a index and can't be matched.