I have a property edApp.name that I query with match. I have confirmed that its mapping is "type": "string", so it should be analyzed.
When I query with match, I get a different number of hits each time.
I see the same behaviour whether querying all documents with /_search or a subset through a read alias.
Newer update: A dynamically mapped field seems to be the culprit. The field is generated.edApp.name and it gets dynamically mapped with "not_analyzed". As soon as a document with this field is indexed, the analyzer for edApp.name breaks and I start seeing the weird results with match queries.
document:
{
  "@context": "http://purl.imsglobal.org/ctx/caliper/v1/Context",
  "edApp": {
    "name": "ReadingRainbow"
  }
}
mapping:
"dynamic_templates": [
  {
    "string_theory": {
      "mapping": {
        "index": "not_analyzed",
        "type": "string",
        "doc_values": true
      },
      "match": "*",
      "match_mapping_type": "string"
    }
  },
  {
    "i_dont_know_you": {
      "mapping": {
        "enabled": false
      },
      "match_mapping_type": "object",
      "path_match": "*.extensions.*"
    }
  }
],
"properties": {
  "_all": {
    "enabled": false
  },
  "_timestamp": {
    "enabled": true
  },
  ...
  "edApp": {
    "properties": {
      "name": {
        "type": "string"
      }
    }
  }
}
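To illustrate why generated.edApp.name ends up as not_analyzed, here is a rough Python sketch of how I understand dynamic template selection: the first template whose conditions all hold wins, with "match" tested against the field name and "path_match" against the full dotted path. This is not Elasticsearch code. The function name pick_template is made up for illustration, and fnmatchcase only approximates Elasticsearch's simple wildcard matching.

```python
from fnmatch import fnmatchcase

# The two dynamic templates from the mapping above, in their original order.
DYNAMIC_TEMPLATES = [
    {"string_theory": {
        "mapping": {"index": "not_analyzed", "type": "string", "doc_values": True},
        "match": "*",
        "match_mapping_type": "string",
    }},
    {"i_dont_know_you": {
        "mapping": {"enabled": False},
        "match_mapping_type": "object",
        "path_match": "*.extensions.*",
    }},
]


def pick_template(full_path, detected_type, templates=DYNAMIC_TEMPLATES):
    """Return (template_name, mapping) for the first matching template, else None.

    Hypothetical helper: an approximation of the selection rules, not the real thing.
    """
    field_name = full_path.rsplit(".", 1)[-1]  # "match" looks at the last path segment
    for entry in templates:
        for name, tpl in entry.items():
            if tpl.get("match_mapping_type", detected_type) != detected_type:
                continue
            if "match" in tpl and not fnmatchcase(field_name, tpl["match"]):
                continue
            if "path_match" in tpl and not fnmatchcase(full_path, tpl["path_match"]):
                continue
            return name, tpl["mapping"]
    return None
```

Under these assumptions, pick_template("generated.edApp.name", "string") selects "string_theory", so the new field is mapped as not_analyzed. edApp.name itself has an explicit mapping and should not go through the templates at all.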
query returning inconsistent results:
{
  "query": {
    "match": {
      "edApp.name": "ReadingRainbow"
    }
  }
}
hits.total values when running query multiple times: [44, 56, 57, 69]
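For reference, counts like these can be gathered with a small loop such as the Python sketch below. It assumes a caller-supplied run_search function (a hypothetical name, for example a thin wrapper that POSTs the body to /_search and returns the parsed JSON response), and it relies on hits.total being a plain integer, as it is in ES 1.x.

```python
def collect_hit_totals(run_search, body, runs=10):
    """Run the same search several times and return the distinct hits.total values.

    run_search is an assumed callable: it takes the request body (a dict)
    and returns the parsed search response (a dict).
    """
    totals = []
    for _ in range(runs):
        response = run_search(body)
        totals.append(response["hits"]["total"])
    # Distinct values, sorted, so differences between runs are easy to spot.
    return sorted(set(totals))
```

A single value in the returned list means the query is stable; several values reproduce the behaviour described here.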
term query returning inconsistent results:
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "edApp.name": "ReadingWonders2.0"
          }
        }
      ]
    }
  }
}
hits.total values when running term query multiple times: [21, 33, 34, 46]
Another term query returning inconsistent results (note the lowercase term):
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "edApp.name": "readingwonders2.0"
          }
        }
      ]
    }
  }
}
hits.total values when running term query multiple times: [44, 56, 57, 69] NOTE: these are the same counts we saw with the match query!
query with both terms:
{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "edApp.name": "readingwonders2.0"
          }
        },
        {
          "term": {
            "edApp.name": "ReadingWonders2.0"
          }
        }
      ]
    }
  }
}
hits.total values are consistent: 79 results
As you can see, the inconsistent hit counts from the lowercase and camel-case term searches add up to 79 documents. Could the analyzer be creating this inconsistency?
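To make that hypothesis concrete, here is a rough Python sketch (a toy model, not Elasticsearch code). It assumes that some copies of the data index edApp.name as analyzed and others as not_analyzed, and it simplifies analysis to lowercasing the whole value into a single token. The function names and the sample data are made up for illustration.

```python
def index_tokens(value, analyzed):
    """Tokens stored for a value under each hypothetical mapping.

    Simplification: an analyzed string becomes one lowercased token,
    a not_analyzed string is stored verbatim.
    """
    return [value.lower()] if analyzed else [value]


def term_hits(docs, term, analyzed):
    """Count documents whose stored tokens contain the exact term."""
    return sum(1 for value in docs if term in index_tokens(value, analyzed))


def bool_should_hits(docs, terms, analyzed):
    """Count documents matching at least one of the terms (like bool/should)."""
    return sum(
        1 for value in docs
        if any(term in index_tokens(value, analyzed) for term in terms)
    )


# Hypothetical sample: three documents with the same edApp.name.
docs = ["ReadingWonders2.0"] * 3
```

In this model the lowercase term only matches where the field is analyzed, the camel-case term only matches where it is not_analyzed, and a should clause containing both terms matches every document either way. That would be consistent with the two single-term counts varying while the combined count stays fixed.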
I am using AWS Elasticsearch Service, ES 1.5.2.
preference in the query or routing? – Andrei Stefan