3
votes

I am having an issue querying a field (title) with a regular expression in a query string query.

This works: "title:/test/"
This does not: "title:/^test$/"

However, the documentation suggests it is supported: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-regexp-query.html#regexp-syntax

My goal is to do an exact match. The match should not be partial; it should cover the whole field value.

Does anybody have an idea what might be wrong here?

2
I think for exact matching you can use the default Match Query. - stevenll

2 Answers

4
votes

From the documentation:

The Lucene regular expression engine is not Perl-compatible but supports a smaller range of operators.

You are using the anchors ^ and $, which are not supported because they are not needed. Again from the docs:

Lucene’s patterns are always anchored. The pattern provided must match the entire string
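
To make the anchoring rule concrete, here is a minimal sketch in Python. It does not use Lucene or Elasticsearch: Python's re.fullmatch only stands in for Lucene's "must match the entire string" behaviour, and re.search for the unanchored matching most other engines default to. The function names and the sample terms are made up for illustration.

```python
import re

def lucene_style_match(pattern: str, term: str) -> bool:
    """Emulate implicit anchoring: the pattern must cover the whole term."""
    return re.fullmatch(pattern, term) is not None

def unanchored_match(pattern: str, term: str) -> bool:
    """Typical unanchored behaviour: the pattern may match anywhere in the term."""
    return re.search(pattern, term) is not None

# Hypothetical indexed terms, used only for this illustration.
terms = ["test", "testing", "my test"]

anchored = [t for t in terms if lucene_style_match("test", t)]
unanchored = [t for t in terms if unanchored_match("test", t)]

print(anchored)
print(unanchored)
```

With whole-string matching, only the term that equals "test" survives, which is why explicit ^ and $ add nothing.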

If you are looking for a phrase-style match, you could wrap the query in double quotes like this:

{
  "query": {
    "query_string": {
      "default_field": "title",
      "query": "\"test phrase\""
    }
  }
}

However, this would also match documents whose title contains additional words, such as "test phrase someword".

If you want an exact match, use a term query. Map the title field with "index" : "not_analyzed" (in Elasticsearch 5.0 and later, the keyword field type plays this role), or, for a case-insensitive match, use a custom analyzer built from the keyword tokenizer and a lowercase filter. Your query would look like this:

{
  "query": {
    "term": {
      "title": {
        "value": "my title"
      }
    }
  }
}

This will give you an exact match.
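
The case-insensitive variant can be sketched as request bodies built in Python. This is an illustration under assumptions, not a tested setup: the analyzer name lowercase_keyword is hypothetical, the mapping uses the typeless format of Elasticsearch 7 and later (older versions nest the properties under a type name and use string fields), and no client call is made. Because a term query does not analyze its input, the sketch lowercases the value itself before building the query.

```python
import json

# Index settings and mapping: a custom analyzer that keeps the whole title
# as a single token (keyword tokenizer) and lowercases it.
# "lowercase_keyword" is a made-up name for this example.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "lowercase_keyword": {
                    "type": "custom",
                    "tokenizer": "keyword",
                    "filter": ["lowercase"],
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "lowercase_keyword"}
        }
    },
}

def exact_title_query(title: str) -> dict:
    """Build a term query; the value is lowercased here because term
    queries are not passed through the field's analyzer."""
    return {"query": {"term": {"title": {"value": title.lower()}}}}

print(json.dumps(exact_title_query("My Title"), indent=2))
```

If the field is mapped as not analyzed instead, drop the lowercasing and pass the title exactly as it was indexed.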

3
votes

In most regex engines, the ^ and $ symbols indicate that the text must be located at the start or end of the string. This is called anchoring. Lucene regex patterns are anchored by default.

So the pattern "test" in Elasticsearch is equivalent to "^test$" in, say, Java.

You have to do extra work to "unanchor" your pattern, for example by using "te.*" to match "test", "testing" and "teeth". The pattern "test" on its own would only match "test".
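
A rough sketch of that unanchoring step, again using Python's re.fullmatch only as a stand-in for Lucene's whole-term matching. The helper names unanchor and as_query_string are hypothetical, and the generated string simply follows the "title:/.../" form from the question.

```python
import re

def unanchor(pattern: str) -> str:
    """Wrap the pattern in .* on both sides so it may match inside a term."""
    return f".*({pattern}).*"

def as_query_string(field: str, pattern: str) -> str:
    """Format a regex for a query string query, e.g. title:/te.*/"""
    return f"{field}:/{pattern}/"

def whole_term_match(pattern: str, term: str) -> bool:
    """Stand-in for Lucene's implicit anchoring."""
    return re.fullmatch(pattern, term) is not None

# Hypothetical terms for illustration.
terms = ["test", "testing", "teeth", "attest"]

prefix_hits = [t for t in terms if whole_term_match("te.*", t)]
contains_hits = [t for t in terms if whole_term_match(unanchor("test"), t)]

print(as_query_string("title", "te.*"), prefix_hits)
print(as_query_string("title", unanchor("test")), contains_hits)
```

The prefix pattern "te.*" skips "attest" because the term must start with "te", while the wrapped pattern accepts any term that contains "test".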

Note that matching the whole field value this way requires the field to be not analyzed, and that regular expression queries can perform very poorly. For an exact match, use a term filter as described in the answer by ChintanShah25.