English analyzer (stemming) in ElasticSearch does not work

Question

I tried to apply a custom English analyzer, as well as the built-in english analyzer, in Elasticsearch. My main goal is stemming. Say my documents contain the following words: covers, impression.

If I search for, e.g., cover, impressive, or impressions, I get 0 results. I only get hits when I search for the exact terms "covers" or "impression".

These are my settings in Elasticsearch (based on this documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-lang-analyzer.html):

{
  "settings": {
    "analysis": {
      "filter": {
        "english_stop": {
          "type":       "stop",
          "stopwords":  "_english_" 
        },
        "english_stemmer": {
          "type":       "stemmer",
          "language":   "english"
        },
        "english_possessive_stemmer": {
          "type":       "stemmer",
          "language":   "possessive_english"
        }
      },
      "analyzer": {
        "rebuilt_english": {
          "tokenizer":  "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        }
      }
    }
  }
}

My mapping looks as follows:

"mapping": {
  "_doc": {
    "properties": {
      "title": {"type": "text",
                "analyzer": "rebuilt_english"},
      "description": {"type": "text",
                      "analyzer": "rebuilt_english"}
    }
  }
}

Following a few different tutorials, I also tried changing the settings like this (only the changed part is shown, not the full code):

{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_english": {
          "type": "custom",
          "filter": #and so on...

Am I missing something here? As far as I understand, I need to define a named analyzer under "settings" and then reference that name in the "mapping" properties, so that every field is analyzed according to that definition.

I also tried skipping the custom settings and setting the analyzer directly on each field in the mapping, like this:

"title": {"type": "text",
"analyzer": "english"}

This doesn't work either (even when using filters like stemming).

I really tried to find a solution for hours, but I can't get it to work. Help would be much appreciated. Thanks!

UPDATE

This is the code I used to create the index (my latest attempt; as described above, I also tried other variations):

PUT /my_index

{
  "settings": {
    "analysis": {
      "analyzer": {
        "rebuilt_english": {
          "type": "custom",
      "filter": {
        "english_stop": {
          "type": "stop",
          "stopwords": "_english"
        },
        "english_stemmer": {
          "type": "stemmer",
          "language": "english"
        },
        "english_possessive_stemmer": {
          "type": "stemmer",
          "language": "possessive_english"
        },
          "tokenizer": "standard",
          "filter": [
            "english_possessive_stemmer",
            "lowercase",
            "english_stop",
            "english_stemmer"
            ]
        }
      }
    }
  },
  "mappings": {
    "_doc": {
      "properties": {
        "title": { "type": "text",
          "analyzer": "rebuilt_english"
        },
        "description": { "type": "text",
                    "analyzer": "rebuilt_english"}
                    }
        }
      }
    }
}

Can you post actual index mapping? GET /index-name should return that. Maybe there was a mistake somewhere — Evaldas Buinauskas
When I do this, I actually notice that there isn't any analyzer mapped to my items, even though I did map the analyzer when creating my index. Only the type is correctly mapped. — runner2018
There you go. I think the issue is that you specified mapping, not mappings, during index creation. — Evaldas Buinauskas
I will update my question and post the code I used to create the index in the end of my post. — runner2018
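
The check suggested in the comments can be run as console requests. This is a minimal sketch, assuming the index is named my_index as in the question's update. If the analyzer was applied, each text field in the mapping response should list "analyzer": "rebuilt_english" next to its type, and the settings response should contain the analysis section that was sent at creation time.

GET /my_index/_mapping

GET /my_index/_settings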

Evaldas Buinauskas · Accepted Answer · 2019-01-24T08:15:24

Your issue was that the filter key, which holds all your named filters, was in the wrong place. It was nested inside analyzer, but it is supposed to be a sibling of analyzer, with both sitting directly under analysis.

So my bet is that the following config should work as expected:

{
  "settings":{
    "analysis":{
      "filter":{
        "english_stop":{
          "type":"stop",
          "stopwords":"_english_"
        },
        "english_stemmer":{
          "type":"stemmer",
          "language":"english"
        },
        "english_possessive_stemmer":{
          "type":"stemmer",
          "language":"possessive_english"
        }
      },
      "analyzer":{
        "rebuilt_english":{
          "type":"custom",
          "tokenizer":"standard",
          "filter":[
            "english_possessive_stemmer",
            "lowercase",
            "english_stop",
            "english_stemmer"
          ]
        }
      }
    }
  },
  "mappings":{
    "_doc":{
      "properties":{
        "title":{
          "type":"text",
          "analyzer":"rebuilt_english"
        },
        "description":{
          "type":"text",
          "analyzer":"rebuilt_english"
        }
      }
    }
  }
}
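
To check whether stemming is actually applied once the index has been recreated, the _analyze API shows the tokens an analyzer produces. The following is a minimal sketch, assuming the index is named my_index (the name from the question) and was created with the config above. The sample text reuses the words from the question.

GET /my_index/_analyze
{
  "field": "title",
  "text": "covers impression"
}

GET /my_index/_search
{
  "query": {
    "match": {
      "title": "cover"
    }
  }
}

Passing "field" instead of "analyzer" makes the request use whatever analyzer the mapping assigns to title, so it checks the mapping and the analyzer definition together. If the response contains stemmed tokens such as cover rather than covers, the match query for cover should find documents whose title contains covers.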
