I have an Elasticsearch index, my_index, with millions of documents keyed by my_uuid. On top of that index I have several filtered aliases of the following form (showing only my_alias, as retrieved by GET my_index/_alias/my_alias):
{
  "my_index": {
    "aliases": {
      "my_alias": {
        "filter": {
          "terms": {
            "my_uuid": [
              "0944581b-9bf2-49e1-9bd0-4313d2398cf6",
              "b6327e90-86f6-42eb-8fde-772397b8e926",
              thousands of rows...
            ]
          }
        }
      }
    }
  }
}
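For context, an alias of this shape is normally created through the _aliases endpoint. A minimal sketch (in Python, building the request body only; the index, alias, and UUID values are the ones from the question, and the real alias would carry thousands of UUIDs):

```python
import json

def build_alias_action(index, alias, field, uuids):
    """Build an _aliases 'add' action carrying a terms filter
    (request shape used by Elasticsearch 2.x)."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    "filter": {"terms": {field: uuids}},
                }
            }
        ]
    }

# Two placeholder UUIDs stand in for the thousands in the real alias.
body = build_alias_action(
    "my_index", "my_alias", "my_uuid",
    ["0944581b-9bf2-49e1-9bd0-4313d2398cf6",
     "b6327e90-86f6-42eb-8fde-772397b8e926"])
print(json.dumps(body, indent=2))
```

The body would then be POSTed to /_aliases on the cluster.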
My understanding is that the filter will be cached transparently for me, without any configuration on my part. The thing is, searches that go through the alias are very slow, which suggests that either 1. the filter is not cached, or 2. it is wrongly written.
Indicative numbers:
GET my_index/_search -> 50ms
GET my_alias/_search -> 8000ms
I can provide further information on the cluster scale, and size of data if anyone considers this relevant.
I am using Elasticsearch 2.4.1. I am getting the right results; it is just the performance that concerns me.
Is my_uuid not_analyzed? But thousands of terms on a filter seems quite heavyweight. If you know these uuids at index time you could add a new field, aliases, to each doc. Then your filter would just have a single term. – NikoNyrh

my_uuid is not_analyzed. Indeed I know them at index time, but they are dynamically updated in bulk, so I did not want to hard code them into the searchable documents. – yannisf

...my_uuids, and just uploading the query takes about 6 seconds. So I guess this is not considered a viable solution. – yannisf
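NikoNyrh's suggestion above can be sketched as follows. This is only an illustration of the idea, not code from the question: the aliases field name is the one the commenter proposed, and the tagging helper is hypothetical. Each document gets tagged at index time with the alias names it belongs to, so the alias filter collapses from thousands of terms to a single one:

```python
def doc_with_alias_tags(doc, alias_names):
    """Return a copy of the document with an 'aliases' tag field added,
    listing the aliases this document should match."""
    tagged = dict(doc)
    tagged["aliases"] = list(alias_names)
    return tagged

def single_term_filter(alias_name):
    """The alias filter then needs only one term instead of thousands."""
    return {"term": {"aliases": alias_name}}

# Tag a document and build the corresponding one-term alias filter.
doc = doc_with_alias_tags(
    {"my_uuid": "0944581b-9bf2-49e1-9bd0-4313d2398cf6"},
    ["my_alias"])
flt = single_term_filter("my_alias")
```

The trade-off, as the follow-up comments note, is that membership changes now require bulk-updating documents rather than just swapping the alias filter.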