0
votes

Basically, I'm currently building fuzzy search on Elasticsearch, and I have two kinds of query to compare.

One is a match query with automatic fuzziness:

{
    "query": {
       "match": {
         "user": {
           "query": "test",
           "fuzziness": "AUTO"
         }
       }
    }
}

The other is a terms query that lists multiple typo variants explicitly:

{
    "query" : {
        "terms" : {
            "user" : ["test", "testt", "tesr", "tst", ...]
        }
    }
}

Assuming there might be around 20 or more terms, what I want to know is: which one is more likely the better practice and better for performance, and how well does terms matching scale with a lot of keywords?

2 Answers

1
votes

Match Query:

  1. Analyzes the input string and constructs more basic queries from it.
  2. It is used when you need full text search functionality.
  3. You can use it for partial matches, token search, and fuzzy matching.

Term query:

  1. Matches exact terms.
  2. Should be used if the searched text doesn't require any analysis, i.e. the text has to be matched exactly as it is.
  3. It is generally faster than match, since it skips the analysis step.
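
To make the analyzed vs. exact distinction concrete, here is a toy Python sketch. It does not call Elasticsearch at all: `analyze`, `match_like` and `term_like` are made-up names, and the "analyzer" is just lowercase-and-split, which is only an assumption about what a simple analyzer does.

```python
def analyze(text):
    """Toy stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()

# Tokens a text field would hold for this document in the toy model.
indexed_tokens = set(analyze("Quick Brown Fox"))

def match_like(query):
    # Like match: analyze the query first, then look for any resulting token.
    return any(token in indexed_tokens for token in analyze(query))

def term_like(query):
    # Like term: compare the query string exactly as given, no analysis.
    return query in indexed_tokens

print(match_like("QUICK"))  # True, the query is lowercased before lookup
print(term_like("QUICK"))   # False, "QUICK" is not stored as-is
print(term_like("quick"))   # True, exact token
```

This is why an unanalyzed query can surprise you on an analyzed field: the stored tokens may not look like the original text.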
0
votes

Let's start with performance:

From the fuzzy docs:

To find similar terms, the fuzzy query creates a set of all possible variations, or expansions, of the search term within a specified edit distance. The query then returns exact matches for each expansion.

In other words, both queries end up executing in a similar way. That said, the terms query does not analyze its input, which makes it the more 'efficient' of the two, assuming you really do want exact full-term matches.
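
As a rough illustration of what "within a specified edit distance" means for the typos in the question, here is a small Python sketch. It is not how Elasticsearch implements it internally. `edit_distance` and `auto_max_edits` are names I made up; the thresholds in `auto_max_edits` follow the documented default for `AUTO` (0 edits for terms of 1-2 characters, 1 edit for 3-5, 2 edits for longer terms).

```python
def edit_distance(a, b):
    """Edit distance counting insert, delete, substitute and adjacent transposition."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete every character of a's prefix
    for j in range(cols):
        d[0][j] = j  # insert every character of b's prefix
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Swapping two adjacent characters counts as a single edit.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def auto_max_edits(term):
    """Edits allowed for a term under the default AUTO thresholds."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

query = "test"
variants = ["test", "testt", "tesr", "tst"]  # the typos listed in the question
limit = auto_max_edits(query)
covered = {v: edit_distance(query, v) <= limit for v in variants}
print(limit)    # 1
print(covered)  # {'test': True, 'testt': True, 'tesr': True, 'tst': True}
```

So every variant written out by hand in the terms query is already inside what `"fuzziness": "AUTO"` would consider for `test`.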

Better practice:

This is hard to answer without having more details about your product, data and use case.

That said, I feel the terms query is the better solution. Would you really want part to match park, or resort to match report? Fuzziness is tricky to use blindly; if you do go down this path, I recommend adding some extra logic somewhere to deal with these outcomes.
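
One possible shape for that extra logic, as a minimal Python sketch: a deny-list applied after the search. Everything here is hypothetical (the `CONFUSABLE` table, the `drop_confusable` helper), and it assumes your application can see which term each hit actually matched.

```python
# Hypothetical deny-list: query term -> terms that are one edit away
# in spelling but unrelated in meaning.
CONFUSABLE = {
    "part": {"park"},
    "resort": {"report"},
}

def drop_confusable(query, matched_terms):
    """Keep only matched terms that are not known false friends of the query."""
    blocked = CONFUSABLE.get(query, set())
    return [term for term in matched_terms if term not in blocked]

print(drop_confusable("part", ["part", "parts", "park"]))  # ['part', 'parts']
print(drop_confusable("test", ["test", "tesr"]))           # ['test', 'tesr']
```

Maintaining a list like this by hand only works for a small vocabulary, which is part of why I lean towards the explicit terms query.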