I am new to Elasticsearch, and right now I am trying to figure out why my synonyms are not returning any results like I expect them to.

I created a custom synonym filter and an analyzer that uses it, then applied that analyzer to both the _all field and the specialty field.

When I search for "specialty": "aids" without the analyzer/tokenizer, it gives me zero results as expected.

However, when I search for "specialty": "aids" with the analyzer/tokenizer, I expect the same results as searching for "specialty": "retrovirology", which should yield 3 results, but it comes back with nothing.

Is there something wrong with how I am approaching this?


Here are my settings and some sample data:

curl -XDELETE "http://localhost:9200/personsearch"

curl -XPUT "http://localhost:9200/personsearch" -d'
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "XYZSynAnalyzer": {
            "tokenizer": "standard",
            "filter": [
              "XYZSynFilter"
            ]
          }
        },
        "filter": {
          "XYZSynFilter": {
            "type": "synonym",
            "synonyms": [
              "aids, retrovirology"
            ]
          }
        }
      }
    }
  },
  "mappings": {
    "xyzemployee": {
      "_all": {
        "analyzer": "XYZSynAnalyzer"
      },
      "properties": {
        "firstName": {
          "type": "string"
        },
        "lastName": {
          "type": "string"
        },
        "middleName": {
          "type": "string",
          "include_in_all": false,
          "index": "not_analyzed"
        },
        "specialty": {
          "type": "string",
          "analyzer": "XYZSynAnalyzer"
        }
      }
    }
  }
}'

curl -XPUT "http://localhost:9200/personsearch/xyzemployee/1" -d'
{
  "firstName": "Don",
  "middleName": "W.",
  "lastName": "White",
  "specialty": "Adult Retrovirology"
}'

curl -XPUT "http://localhost:9200/personsearch/xyzemployee/2" -d'
{
  "firstName": "Terrance",
  "middleName": "G.",
  "lastName": "Gartner",
  "specialty": "Retrovirology"
}'

curl -XPUT "http://localhost:9200/personsearch/xyzemployee/3" -d'
{
  "firstName": "Carter",
  "middleName": "L.",
  "lastName": "Taylor",
  "specialty": "Pediatric Retrovirology"
}'

# Why is this returning nothing?
curl -XGET "http://localhost:9200/personsearch/xyzemployee/_search?pretty=true" -d'
{
  "query": {
    "match": {
      "specialty": "retrovirology"
    }
  }
}'
1 Answer

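You aren't lowercasing anywhere. The standard tokenizer keeps the original case, so "Retrovirology" is indexed with a capital R and never matches the lowercase synonym rule or a lowercase query. Put a lowercase filter before the synonym filter: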

{
 "settings": {
   "index": {
     "analysis": {
       "analyzer": {
         "XYZSynAnalyzer": {
           "tokenizer": "standard",
           "filter": [
             "lowercase", "XYZSynFilter"
           ]
         }
       },
       "filter": {
         "XYZSynFilter": {
           "type": "synonym",
           "synonyms": [
             "aids, retrovirology"
           ]
         }
       }
     }
   }
 }
}
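
To check which tokens the analyzer actually produces, the _analyze API can be handy. This is only a sketch: it assumes the index has been deleted and recreated with the settings above, and it uses the query-string form of the API that matches the 1.x-era "string" mappings in the question (newer versions expect a JSON body instead).

```shell
# Assumption: index "personsearch" exists and defines XYZSynAnalyzer as above.
# Runs the sample text through the analyzer and prints the resulting tokens.
curl -XGET "http://localhost:9200/personsearch/_analyze?analyzer=XYZSynAnalyzer&pretty=true" -d 'Adult Retrovirology'
```

If the filters are applied as intended, the tokens should come back lowercased, with "aids" listed at the same position as "retrovirology".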

Note: you may want to split your index analyzer and search analyzer, and apply the synonyms in only one of them. Expanding them only at index time keeps the query-time work smaller, so searches run faster.
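
A rough sketch of that split, assuming the same 1.x-era cluster as in the question. The analyzer names XYZSynIndexAnalyzer and XYZSearchAnalyzer are made up for illustration, and the index_analyzer field setting is the pre-2.0 spelling (later versions use analyzer together with search_analyzer). Only the specialty field is shown; the other fields would stay as in the question.

```shell
# Recreate the index so the new analysis settings take effect.
curl -XDELETE "http://localhost:9200/personsearch"

# Synonyms are expanded only at index time; searches just lowercase the query.
curl -XPUT "http://localhost:9200/personsearch" -d'
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "XYZSynIndexAnalyzer": {
            "tokenizer": "standard",
            "filter": ["lowercase", "XYZSynFilter"]
          },
          "XYZSearchAnalyzer": {
            "tokenizer": "standard",
            "filter": ["lowercase"]
          }
        },
        "filter": {
          "XYZSynFilter": {
            "type": "synonym",
            "synonyms": ["aids, retrovirology"]
          }
        }
      }
    }
  },
  "mappings": {
    "xyzemployee": {
      "properties": {
        "specialty": {
          "type": "string",
          "index_analyzer": "XYZSynIndexAnalyzer",
          "search_analyzer": "XYZSearchAnalyzer"
        }
      }
    }
  }
}'
```

The trade-off is that changing the synonym list then means reindexing the documents, since the expansion is baked into the index.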