I have a field that is analyzed with the whitespace tokenizer and the lowercase and asciifolding filters. I'm trying to run a query that combines a phrase with a prefix (wildcard) term. We are replacing a homegrown search engine built on pure Lucene with Elasticsearch; similar queries worked with Lucene syntax but don't work in Elasticsearch.
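For reference, the analysis chain described above corresponds roughly to index settings like the following. This is only a sketch: the analyzer name `name_analyzer` is a placeholder, and the typeless `mappings` layout with a `text` field assumes a recent Elasticsearch version (7.x or later).

```json
{
  "settings": {
    "analysis": {
      "analyzer": {
        "name_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "name_analyzer"
      }
    }
  }
}
```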
For example, this query finds all documents that have the phrase "smith john" in the field "name":
{
  "query": {
    "simple_query_string": {
      "query": "\"smith john\"",
      "fields": ["name"],
      "default_operator": "AND"
    }
  }
}
However, I also want to find "smith johnny", "smith john a", etc. In our Lucene code we simply added the prefix operator inside the phrase to do this search, but in Elasticsearch the equivalent query returns 0 results:
{
  "query": {
    "simple_query_string": {
      "query": "\"smith joh*\"",
      "fields": ["name"],
      "default_operator": "AND"
    }
  }
}
If I leave out the quotes, I get results, but they include documents where the two terms match different names, for example a document that has both "smith barry" and "wilson john" indexed in the same field. I only want names like "smith john", "smith johnny", etc.
I have tried variations of query_string as well with similar results.
I know I can use "match_phrase_prefix" to search for "smith joh", but that comes with its own limitations, such as restricting the use of wildcards and needing to know or guess a value for max_expansions.
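To be concrete, the "match_phrase_prefix" alternative I mean looks roughly like this. The max_expansions value of 50 is just the documented default, shown here to illustrate the parameter that would have to be tuned; it is not a value I have settled on.

```json
{
  "query": {
    "match_phrase_prefix": {
      "name": {
        "query": "smith joh",
        "max_expansions": 50
      }
    }
  }
}
```

This expands only the last term ("joh") as a prefix, which is why it does not cover wildcards elsewhere in the phrase.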
What do I need to change to get results from the second query? Thank you.