
Is it possible in Elasticsearch to return no results when the query consists of a single term? I am using "minimum_should_match" set to 2, which works well when the query has 2 or more terms. However, if the query has only 1 term, ES still returns results. It seems that ES is using the logic of "well, you asked for a minimum of two matching terms, yet there is only one term to match; we'll lower the minimum to 1". Is there a way to turn this behavior off, or otherwise do what I am looking for?

For those wondering why this can't be done at the API level, I am using a query analyzer that excludes stop words. So a query like "a ipad" would end up being 1 term, while the API would see 2. The API could do stopword filtering but that seems to be a waste of resources.

The documentation says: "(ie: no matter how low or how high the result of the calculation result is, the minimum number of required matches will never be lower than 1 or greater than the number of clauses)". So maybe minimum_should_match isn't what I need for this. - mrquintopolous
What exactly are you looking at, minimum_clause or minimum_terms? Could you provide a couple of documents? - ChintanShah25
What exactly are minimum_clause or minimum_terms? A Google search turned up nothing. I am searching for a way of saying "there need to be more than x terms in a query before doing a search". I would provide documents, but they don't really have anything to do with the question at hand. The minimum_should_match param works well. When set to 2, 2 query terms must match a document for it to be returned. But this only works when there are 2 or more terms in the query! When there is one term in the query, ES still matches documents. - mrquintopolous
Sorry for the confusion about minimum_clause and minimum_terms. I will make it simple: if your query is "apple steve jobs", you want at least 2 of the three terms to match and to exclude any document matching only one term, right? - ChintanShah25

1 Answer


Before running the query, you can first analyze the input with your custom analyzer. You can use the Analyze API for this (be sure to set the analyzer property to your custom analyzer's name).

The result is a list of analyzed tokens. If your analyzer removes stopwords, it would return only "ipad" for "a ipad".

So if the Analyze API returns only one token, you don't need to query Elasticsearch at all, because you don't want any results when the number of tokens is less than 2 (if I understood you correctly).
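
The check described above could be sketched roughly as follows in Python, using only the standard library. This is an illustration under assumptions, not a definitive implementation: the base URL, the index name, and the helper names (analyze, tokens_from_response, should_search) are made up for the example, and the JSON request body with "analyzer" and "text" assumes a reasonably recent Elasticsearch version (older releases took these as query parameters instead).

```python
import json
import urllib.request


def tokens_from_response(response):
    """Extract the token strings from an Analyze API response body.

    The response is expected to look like {"tokens": [{"token": "ipad", ...}]}.
    """
    return [t["token"] for t in response.get("tokens", [])]


def analyze(text, analyzer, index, base_url="http://localhost:9200"):
    """Run `text` through `analyzer` via the index-level Analyze API.

    `base_url` and `index` are placeholders; adjust them to your cluster.
    """
    body = json.dumps({"analyzer": analyzer, "text": text}).encode("utf-8")
    req = urllib.request.Request(
        f"{base_url}/{index}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_from_response(json.load(resp))


def should_search(tokens, min_terms=2):
    """Return True only if enough terms survived analysis to be worth querying."""
    return len(tokens) >= min_terms
```

A caller would run something like `tokens = analyze("a ipad", "my_custom_analyzer", "my_index")` (both names hypothetical) and only send the real search request when `should_search(tokens)` is true. The trade-off is one extra round trip per query in exchange for keeping the stopword logic in a single place, the analyzer itself.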