Search API on Google App Engine

Question

I am running a proof of concept using App Engine and the built-in Search API. We are testing the Search API under the assumption that it scales linearly, as is the case with other products and services that are bundled with App Engine.

Specs: approx. 8 million documents in a single index
Query type: Complex queries, we need spatial queries based on square areas, not distance(!). All queries include 2 ranges based on latitude and longitude.
Page sizes: between 16 and 250.
Accuracy (result counting) set to 100 in all test cases.
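
For illustration, a square-area query of this kind could be written as a Search API query string with one numeric range per axis. This is only a sketch: the field names `lat` and `lng` are assumptions (they are not named above), and both are assumed to be stored as number fields rather than as a geopoint.

```python
def bounding_box_query(south, west, north, east, lat_field="lat", lng_field="lng"):
    """Build a query string with two numeric ranges, one per axis.

    The field names are hypothetical. The box is assumed not to cross
    the antimeridian (west <= east).
    """
    if south > north or west > east:
        raise ValueError("expected south <= north and west <= east")
    return (
        f"{lat_field} >= {south} AND {lat_field} <= {north} AND "
        f"{lng_field} >= {west} AND {lng_field} <= {east}"
    )


# Example: a box roughly around Amsterdam.
query_string = bounding_box_query(52.0, 4.0, 52.5, 5.0)
```

The resulting string would then be passed to the search call together with the page size and accuracy settings listed above.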

Our target latency is in the range of hundreds of milliseconds.

We are testing the performance of the Search API by running several concurrent requests. The current results were measured at about 25 concurrent requests, and this number is expected to go up significantly. However, if the Search API scales properly, the number of concurrent requests should not matter.

I am measuring the time it takes the Search API to process a call to Index.search(Query). These are my observations so far:

The average time it takes the search method to return is around 8000 ms. There are no cases in which the method returns significantly faster or slower than that. However, an index with only 10 documents still shows latencies of around 300 ms (!!!). This might be an indication that the Search API is not scalable at all.
The page size does not seem to make a significant difference. Perhaps it will at page sizes of 10,000 or higher, but this is not part of our tests.
Adding one criterion (equality) seems to speed up the search significantly, by up to approximately 40%. This is a nice improvement, but 4 seconds is still an eternity.
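
A measurement of this kind could be set up roughly as follows. This is a sketch only, not the harness that produced the numbers above: `fake_search` is a hypothetical stand-in for the real call to Index.search(Query), and the concurrency of 25 simply mirrors the figure mentioned earlier.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(search_fn, query):
    """Return the wall-clock duration of one search call in milliseconds."""
    start = time.perf_counter()
    search_fn(query)
    return (time.perf_counter() - start) * 1000.0


def measure_latency(search_fn, queries, concurrency=25):
    """Run all queries with a fixed number of concurrent workers and summarise."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(lambda q: timed_call(search_fn, q), queries))
    return {
        "count": len(latencies),
        "mean_ms": statistics.mean(latencies),
        "median_ms": statistics.median(latencies),
        "max_ms": latencies[-1],
    }


def fake_search(query):
    # Hypothetical stand-in for the real search call: sleeps instead of
    # querying an index.
    time.sleep(0.01)


stats = measure_latency(fake_search, ["q"] * 50)
```

Reporting the median and maximum next to the mean makes it easier to see whether all calls are uniformly slow, as described above, or whether a few outliers dominate the average.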

Questions:

1. What is the expected latency (best possible scenario/configuration) that the Search API can deliver?
2. Which parameters influence latency, including App Engine configuration?
3. Does the number of documents in an index influence latency?
4. Is a search based on 2 range queries slower than a search based on equality filters alone? (We ask because we could pre-process the data and add 'index' data to each document.)
5. Is the Search API really scalable?
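
To make the pre-processing idea concrete, here is one way the extra 'index' data might look: each document gets the ID of a coarse grid cell, and a square-area query is turned into equality filters on the cells that overlap the square. This is a sketch under assumptions; the cell size of 0.1 degrees and the ID format are hypothetical choices, not something we have tested against the Search API.

```python
import math

CELLS_PER_DEGREE = 10  # hypothetical resolution: cells of 0.1 x 0.1 degrees


def _row(lat):
    return math.floor((lat + 90.0) * CELLS_PER_DEGREE)


def _col(lng):
    return math.floor((lng + 180.0) * CELLS_PER_DEGREE)


def cell_id(lat, lng):
    """Grid cell ID to store on each document at indexing time."""
    return f"{_row(lat)}_{_col(lng)}"


def cells_for_box(south, west, north, east):
    """IDs of all grid cells that overlap the given bounding box."""
    return [
        f"{r}_{c}"
        for r in range(_row(south), _row(north) + 1)
        for c in range(_col(west), _col(east) + 1)
    ]


doc_cell = cell_id(52.37, 4.89)
box_cells = cells_for_box(52.31, 4.81, 52.39, 4.89)
```

Cells on the edge of the box also contain documents that lie outside it, so equality on cell IDs returns a superset; the exact latitude/longitude ranges would still have to be applied to those candidates.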

Mo'in Creemers · Accepted Answer · 2015-01-15T19:10:10

Our use case was plotting a number of markers on a map using a tile server. However, the tile server performs many queries (one per tile) in parallel, almost 30 per user/view. To make things more difficult, we were not able to solve this problem with pre-aggregated maps because we have too many parameters/dimensions to take care of (if pre-aggregated maps are an option for you, try Google Maps Engine).

So we ended up with a CloudSQL instance set to the highest tier for maximum performance. Another reason to use a relational database is that index performance can be tuned more precisely than with the Search API or BigQuery.
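
As a rough illustration of the per-tile query pattern, the sketch below converts a tile to a bounding box and runs a range query on indexed latitude/longitude columns. Several things here are assumptions: the standard Web Mercator tile scheme, the table and column names (`markers`, `lat`, `lng`), and the in-memory SQLite database, which only stands in for CloudSQL so that the example runs anywhere.

```python
import math
import sqlite3


def tile_bounds(x, y, zoom):
    """Return (south, west, north, east) in degrees for a Web Mercator tile."""
    n = 2 ** zoom

    def lat_of(tile_y):
        return math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * tile_y / n))))

    west = x / n * 360.0 - 180.0
    east = (x + 1) / n * 360.0 - 180.0
    return lat_of(y + 1), west, lat_of(y), east


# In-memory stand-in database with a composite index on the two range columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE markers (id INTEGER PRIMARY KEY, lat REAL, lng REAL)")
conn.execute("CREATE INDEX idx_markers_lat_lng ON markers (lat, lng)")
conn.executemany(
    "INSERT INTO markers (lat, lng) VALUES (?, ?)",
    [(52.37, 4.89), (48.85, 2.35), (-33.87, 151.21)],
)


def markers_in_tile(conn, x, y, zoom, limit=250):
    """Fetch the markers whose coordinates fall inside one tile."""
    south, west, north, east = tile_bounds(x, y, zoom)
    rows = conn.execute(
        "SELECT id, lat, lng FROM markers "
        "WHERE lat BETWEEN ? AND ? AND lng BETWEEN ? AND ? LIMIT ?",
        (south, north, west, east, limit),
    )
    return rows.fetchall()


north_east_quadrant = markers_in_tile(conn, 1, 0, 1)
```

With roughly 30 tiles per view, a function like `markers_in_tile` would be called once per tile, which is why the latency of a single range query matters so much here.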

To answer the questions, this is what we found:

1. The latency depends on the size of the index. At lower volumes per index the latency seems reasonable. At much higher volumes it may become a problem, although for text search it is probably acceptable in most cases.
2. We did not test at lower volumes, but at around 8 million documents the latency sits between 5000 and 8000 ms per query. We did not find any parameters that decreased latency; we did find parameters that increased it.
3. Yes.
4. We did not test this.
5. Yes.
