I am running a proof of concept using App Engine and the built-in Search API. We are testing the Search API under the assumption that it scales linearly, as is the case with other products and services that are bundled with App Engine.
- Specs: approx. 8 million documents in a single index
- Query type: complex queries. We need spatial queries based on square areas, not distance (!). Every query includes 2 range filters, one on latitude and one on longitude.
- Page sizes: between 16 and 250.
- Accuracy (result counting) set to 100 in all test cases.
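To make the query shape concrete, here is a minimal sketch of how a two-range query string of this kind could be built. The field names `lat` and `lon` and the class `BoundingBoxQuery` are hypothetical, and the sketch assumes latitude and longitude are stored as separate number fields so that the comparison operators of the Search API query language apply. The resulting string is what would be handed to the query builder before calling `Index.search(Query)`.

```java
import java.util.Locale;

/** Builds a bounding-box query string. Field names "lat" and "lon" are assumptions. */
final class BoundingBoxQuery {

    static String build(double minLat, double maxLat, double minLon, double maxLon) {
        if (minLat > maxLat || minLon > maxLon) {
            throw new IllegalArgumentException("min must not exceed max");
        }
        // Locale.ROOT keeps '.' as the decimal separator regardless of the server locale.
        return String.format(Locale.ROOT,
                "lat >= %.6f AND lat <= %.6f AND lon >= %.6f AND lon <= %.6f",
                minLat, maxLat, minLon, maxLon);
    }

    public static void main(String[] args) {
        // Prints the query string for an example box.
        System.out.println(build(52.0, 52.5, 4.0, 5.0));
    }
}
```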
Our target latency is in the hundreds of milliseconds.
We are testing the performance of the Search API under several concurrent requests. The results below were measured at about 25 concurrent requests, and we expect that number to go up significantly. If the Search API scales properly, however, the number of concurrent requests should not matter.
I am measuring the time it takes for a call to Index.search(Query) to return. The results so far:
- The average time for the search method to return is around 8000 ms. No call returns significantly faster or slower than that. The same query against an index with only 10 documents, however, takes around 300 ms (!!!). This might be an indication that the Search API does not scale with index size at all.
- The page size does not seem to make any significant difference. Perhaps it will at page sizes of 10,000 or higher, but that is not part of our tests.
- Adding one extra criterion (an equality filter) seems to speed up the search significantly, by up to approximately 40%. That is a nice improvement, but 4 seconds is still an eternity.
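The measurements above can be reproduced with a harness along the following lines. This is a simplified sketch, not the exact test code: `LatencyProbe` is a hypothetical name, and the `Thread.sleep` in `main` is only a stand-in for the real `Index.search(Query)` call. It times each call individually with `System.nanoTime()` while a fixed-size thread pool keeps a given number of requests in flight.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LatencyProbe {

    /** Runs `call` `tasks` times on `concurrency` threads; returns each call's wall time in ms. */
    static List<Long> measure(Callable<?> call, int concurrency, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(concurrency);
        try {
            // Wrap the call so that only the call itself is timed, not the queueing.
            Callable<Long> timed = () -> {
                long start = System.nanoTime();
                call.call();
                return (System.nanoTime() - start) / 1_000_000;
            };
            List<Future<Long>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(timed));
            }
            List<Long> millis = new ArrayList<>();
            for (Future<Long> f : futures) {
                millis.add(f.get());
            }
            return millis;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("measurement failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Stand-in workload; replace the sleep with the real search call.
        List<Long> millis = measure(() -> { Thread.sleep(20); return null; }, 25, 100);
        long sum = 0, max = 0;
        for (long m : millis) { sum += m; max = Math.max(max, m); }
        System.out.println("avg ms: " + (sum / millis.size()) + ", max ms: " + max);
    }
}
```

Timing inside the worker thread keeps pool queueing time out of the numbers, so the figures reflect only how long the call itself blocks.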
Questions:
- What is the expected latency (best possible scenario/configuration) that the Search API can deliver?
- Which parameters influence latency, including App Engine configuration?
- Does the number of documents in an index influence latency?
- Is a search based on 2 range queries slower than a search based on equality filters alone? (I ask because we could pre-process the data and add 'index' data to each document.)
- Is the Search API really scalable?
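
To illustrate the pre-processing idea from the questions above: one form the extra 'index' data could take is a grid cell ID computed from each document's coordinates, so that a square area can be matched by equality on a handful of cell IDs instead of two ranges. The sketch below is an illustrative assumption only, not something that has been built or measured. `GridCells`, the cell size of a quarter degree, and the ID format are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class GridCells {

    // Hypothetical resolution: 4 cells per degree, i.e. cells of 0.25 x 0.25 degrees.
    static final int CELLS_PER_DEGREE = 4;

    static long row(double lat) {
        // Shift by 90 so that rows are non-negative for valid latitudes.
        return (long) Math.floor((lat + 90.0) * CELLS_PER_DEGREE);
    }

    static long col(double lon) {
        // Shift by 180 so that columns are non-negative for valid longitudes.
        return (long) Math.floor((lon + 180.0) * CELLS_PER_DEGREE);
    }

    /** Cell ID to store on each document at indexing time. */
    static String cellId(double lat, double lon) {
        return row(lat) + "_" + col(lon);
    }

    /** IDs of all cells that overlap the given box; these would become equality filters. */
    static List<String> covering(double minLat, double maxLat, double minLon, double maxLon) {
        List<String> cells = new ArrayList<>();
        for (long r = row(minLat); r <= row(maxLat); r++) {
            for (long c = col(minLon); c <= col(maxLon); c++) {
                cells.add(r + "_" + c);
            }
        }
        return cells;
    }

    public static void main(String[] args) {
        System.out.println(cellId(52.3, 4.9));
        System.out.println(covering(52.0, 52.5, 4.0, 5.0).size());
    }
}
```

Cells on the edge of the box only partly overlap it, so the range filters (or a client-side check) would still be needed to trim the results. The sketch also ignores boxes that cross the 180° meridian.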