14
votes

I'm using Elasticsearch and am having a devil of a time getting an exact match to work. I've tried various combinations of match, query_string, etc., and I get either nothing or bad results. The query looks like this:

{
  "filter": {
    "term": {
      "term": "dog",
      "type": "main"
    }
  },
  "query": {
    "match_phrase": {
      "term": "Dog"
    }
  },
  "sort": [
    "_score"
  ]
}

Sorted results

10.102211 {u'term': u'The Dog', u'type': u'main', u'conceptid': 7730506}
10.102211 {u'term': u'That Dog', u'type': u'main', u'conceptid': 4345664}
10.102211 {u'term': u'Dog', u'type': u'main', u'conceptid': 144}
7.147442 {u'term': u'Dog Eat Dog (song)', u'type': u'main', u'conceptid': u'5288184'}

I see, of course, that "The Dog", "That Dog" and "Dog" all have the same score, but I need to figure out how to boost the exact match "Dog" so that it scores highest.

I also tried

{
  "sort": [
    "_score"
  ],
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "term": "Dog"
          }
        },
        {
          "match_phrase": {
            "term": {
              "query": "Dog",
              "boost": 5
            }
          }
        }
      ]
    }
  },
  "filter": {
    "term": {
      "term": "dog",
      "type": "main"
    }
  }
}

but that still just gives me

11.887239 {u'term': u'The Dog', u'type': u'main', u'conceptid': 7730506}
11.887239 {u'term': u'That Dog', u'type': u'main', u'conceptid': 4345664}
11.887239 {u'term': u'Dog', u'type': u'main', u'conceptid': 144}
8.410372 {u'term': u'Dog Eat Dog (song)', u'type': u'main', u'conceptid': u'5288184'}

3 Answers

14
votes

Fields are analyzed with the standard analyzer by default. To check for an exact match, you can additionally index the field as not analyzed, e.g.:

"dog": {
  "type": "multi_field",
  "fields": {
    "dog": {
      "include_in_all": false,
      "type": "string",
      "index": "not_analyzed",
      "store": "no"
    },
    "_tokenized": {
      "include_in_all": false,
      "type": "string",
      "index": "analyzed",
      "store": "no"
    }
  }
}

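Then you can query the dog field for exact matches, and dog._tokenized for analyzed queries (such as full-text search). Note that a not_analyzed field is indexed verbatim, so exact matches on it are case-sensitive.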
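To make the idea concrete, here is a minimal sketch of a search body that uses both sub-fields: every hit must match the analyzed field, and an optional term clause adds a boost when the unanalyzed value equals the search text exactly. The function name build_exact_boost_query is hypothetical, the field names are taken from the mapping in this answer (adapt them to your own field, e.g. term), and the boost of 5 is only the value the question already tried.

```python
import json


def build_exact_boost_query(text, exact_field="dog",
                            analyzed_field="dog._tokenized", boost=5):
    """Return a search body that ranks verbatim matches first.

    Field names are assumptions taken from the multi_field mapping above.
    """
    return {
        "query": {
            "bool": {
                # Every hit must match the analyzed sub-field.
                "must": [{"match": {analyzed_field: text}}],
                # Optional clause: adds to the score only when the
                # unanalyzed value equals `text` exactly (case-sensitive).
                "should": [
                    {"term": {exact_field: {"value": text, "boost": boost}}}
                ],
            }
        }
    }


body = build_exact_boost_query("Dog")
print(json.dumps(body, indent=2))
```

Because the term clause sits under should rather than must, documents such as "The Dog" still match; they just miss the extra score that only the verbatim "Dog" document receives.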

0
votes

I think your problem is that the field term is being analyzed (check your mapping) with the standard analyzer, which filters out stopwords such as "the" or "that". That is why you get the same score for "Dog" and "The Dog". You may be able to solve this by configuring a custom analyzer; see the documentation page.
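The effect described here can be illustrated with a toy model of analysis. This is not Elasticsearch's analyzer, only a rough imitation that assumes, as this answer does, that the text is lowercased, split into words, and stripped of English stopwords. The STOPWORDS set is an abbreviated, illustrative list, and analyze is a hypothetical helper.

```python
import re

# Abbreviated stopword list for illustration; not the analyzer's full list.
STOPWORDS = {"a", "an", "and", "of", "that", "the", "this", "to"}


def analyze(text, remove_stopwords=True):
    """Lowercase, split on non-word characters, optionally drop stopwords."""
    tokens = re.findall(r"\w+", text.lower())
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens


for title in ["The Dog", "That Dog", "Dog", "Dog Eat Dog (song)"]:
    with_stop = analyze(title)
    without_stop = analyze(title, remove_stopwords=False)
    print(title, "->", with_stop, "| keeping stopwords:", without_stop)
```

With stopwords removed, "The Dog", "That Dog" and "Dog" all reduce to the single token dog, so the index cannot tell them apart and they score the same. Keeping stopwords (which is what a custom analyzer without a stop filter would do) leaves "Dog" as the only one-token value.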

0
votes

Hash the two values you need to search on into a single hash key, store that key with the document, then search on the key.
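One way to read this suggestion, sketched under several assumptions: the two values are the term and type fields from the question, the hash is SHA-1, and the result is stored in an extra unanalyzed field. The names composite_key and exact_key are hypothetical; the answer does not specify any of these details.

```python
import hashlib


def composite_key(*values):
    """Join the values with a separator and return a SHA-1 hex digest."""
    # The unit separator keeps ("ab", "c") distinct from ("a", "bc").
    joined = "\x1f".join(values)
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


# At index time, store the key alongside the document
# (exact_key is a hypothetical field name).
doc = {"term": "Dog", "type": "main"}
doc["exact_key"] = composite_key(doc["term"], doc["type"])

# At search time, hash the same two values and look up that key.
wanted = composite_key("Dog", "main")
print(doc["exact_key"] == wanted)
```

The key only matches when both values are identical character for character, so "The Dog" or "dog" would produce a different key.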