I have been using the Elasticsearch completion suggester lately, and I have run into a problem: it returns near-duplicate results.

Say I search with the following statement:

    "my_suggestion": {
        "text": "ni",
        "completion": {
            "field": "my_name_for_sug"
        }
    }

And get the following results:

    "my_suggestion" : [ {
      "text" : "ni",
      "offset" : 0,
      "length" : 2,
      "options" : [ {
        "text" : "Nine West",
        "score" : 329.0
      }, {
        "text" : "Nine West ",
        "score" : 329.0
      }, {
        "text" : "Nike",
        "score" : 295.0
      }, {
        "text" : "NINE WEST",
        "score" : 168.0
      }, {
        "text" : "NINE WEST ",
        "score" : 168.0
      } ]
    } ],

So the question is: how can I merge or aggregate results that are effectively the same, such as "NINE WEST" and "NINE WEST "?

The mapping is:

    "my_name_for_sug": {
        "type": "completion",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_max_word",
        "payloads": true,
        "preserve_separators": false
    }

where ik_max_word is a Chinese-specific analyzer that can also do the standard analyzer's job.

Thanks

1 Answer

Elasticsearch suggesters automatically de-duplicate identical output (at least up to 2.x). I haven't tried 5.x yet, and there are some changes to suggesters there. The problem seems to be your index analyzer, which indexes your documents in such a way that "Nine West", "Nine West ", "NINE WEST", and "NINE WEST " aren't exactly the same.

You need to index them using an analyzer that lowercases the tokens and strips extra whitespace. Once you do that, you should get de-duplicated suggestions, as you want.