I am querying Elasticsearch (v6.7) for items that match the phrase "x-ray" using the query below:

POST item/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "type": "phrase_prefix",
          "query": "X-Ray",
          "fields": [
            "mpn",
            "product_description",
            "manufacturer_name"
          ], 
          "operator": "and",
          "analyzer": "standard"
        }
      }
    }
  }
}

The result set is empty.

I do have item documents that contain the phrase "x-ray". For example, if I request:

GET items/_doc/3e4a2d80-9d5e-11e7-a6c5-6ddf18575461

It returns:

{
    "_index": "items",
    "_type": "_doc",
    "_id": "3e4a2d80-9d5e-11e7-a6c5-6ddf18575461",
    "_version": 1,
    "_seq_no": 7605,
    "_primary_term": 1,
    "found": true,
    "_source": {
        "manufacturer_name": "GE",
        "var_pricing": 0,
        "on_hand": 1,


        ...

        "product_description": "Portable X-Ray w/Fuji CR Reader", <----This should be a match!
        "project_id": null,
        "user_id": "12",
        "quote_items": [],
        "parentCategory": [
            0
        ]
    }
}

On a freshly installed version of Elasticsearch (v7.3), if I add three documents like so:

POST product/_bulk
{"index":{"_id":1001}}
{"name":"x-ray Machine","price":152000,"in_stock":38,"sold":47,"tags":["Alcohol","Wine"],"description":"x-ray machine for x-rays","is_active":true,"created":"2004\/05\/13"}
{"index":{"_id":1002}}
{"name":"X-Ray film","price":99,"in_stock":10,"sold":430,"tags":[],"description":"just some x-ray film","is_active":true,"created":"2007\/10\/14"}
{"index":{"_id":1003}}
{"name":"Table","price":2500,"in_stock":24,"sold":215,"tags":[],"description":"could be used for an x-ray table","is_active":true,"created":"2000\/11\/17"}

Then query with:

POST product/_search
{
  "query": {
    "bool": {
      "must": {
        "multi_match": {
          "type": "phrase_prefix",
          "query": "X-Ray",
          "fields": [
            "name", 
            "description"
          ], 
          "operator": "and",
          "analyzer": "standard"
        }
      }
    }
  }
}

All three items are returned:

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 3,
      "relation" : "eq"
    },
    "max_score" : 31.876595,
    "hits" : [
      {
        "_index" : "product",
        "_type" : "default",
        "_id" : "1001",
        "_score" : 31.876595,
        "_source" : {
          "name" : "x-ray Machine",
          "price" : 152000,
          "in_stock" : 38,
          "sold" : 47,
          "tags" : [
            "Alcohol",
            "Wine"
          ],
          "description" : "x-ray machine for x-rays",
          "is_active" : true,
          "created" : "2004/05/13"
        }
      },
      {
        "_index" : "product",
        "_type" : "default",
        "_id" : "1002",
        "_score" : 27.347116,
        "_source" : {
          "name" : "X-Ray film",
          "price" : 99,
          "in_stock" : 10,
          "sold" : 430,
          "tags" : [ ],
          "description" : "just some x-ray film",
          "is_active" : true,
          "created" : "2007/10/14"
        }
      },
      {
        "_index" : "product",
        "_type" : "default",
        "_id" : "1003",
        "_score" : 25.889376,
        "_source" : {
          "name" : "Table",
          "price" : 2500,
          "in_stock" : 24,
          "sold" : 215,
          "tags" : [ ],
          "description" : "could be used for an x-ray table",
          "is_active" : true,
          "created" : "2000/11/17"
        }
      }
    ]
  }
}

What gives?

I used the explain API to get more insight, but all it says is that there isn't a match:

POST items/_doc/3e4a2d80-9d5e-11e7-a6c5-6ddf18575461/_explain
{
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "type": "phrase_prefix",
            "query": "X-Ray",
            "fields": [
              "product_description",
              "mpn",
              "manufacturer_name"
            ],
            "operator": "and",
            "analyzer": "standard"
          }
        }
      ]
    }
  }
}

Returns:

{
    "_index": "items",
    "_type": "_doc",
    "_id": "3e4a2d80-9d5e-11e7-a6c5-6ddf18575461",
    "matched": false,
    "explanation": {
        "value": 0,
        "description": "Failure to meet condition(s) of required/prohibited clause(s)",
        "details": [
            {
                "value": 0,
                "description": "no match on required clause (((+product_description:x +product_description:ray) | (+mpn:x +mpn:ray) | (+manufacturer_name:x +manufacturer_name:ray)))",
                "details": [
                    {
                        "value": 0,
                        "description": "No matching clause",
                        "details": []
                    }
                ]
            },
            {
                "value": 0,
                "description": "no match on required clause (MatchNoDocsQuery(\"Type list does not contain the index type\"))",
                "details": [
                    {
                        "value": 0,
                        "description": "MatchNoDocsQuery(\"Type list does not contain the index type\") doesn't match id 12556",
                        "details": []
                    }
                ]
            }
        ]
    }
}

Switching the query's analyzer to whitespace or keyword makes little difference either.

Can you show your index mapping? - user156327
@AmitKhandelwal I have figured out why it keeps using the standard analyzer: the mappings set a custom analyzer (which used the standard analyzer) on these fields. - Noah Gary

2 Answers

1
votes

(This is not an answer, but I could not fit all of this in a comment.)

I am not sure you need to specify an analyzer in your query if you intend to match X-Ray as a whole.

Look at this:

POST _analyze
{
  "analyzer": "standard", 
  "text":"X-Ray"
}

The response is:

{
  "tokens" : [
    {
      "token" : "x",
      "start_offset" : 0,
      "end_offset" : 1,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "ray",
      "start_offset" : 2,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 1
    }
  ]
}

So your search term X-Ray became the two tokens x and ray. Is this what you intended?
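
To make the effect concrete, here is a rough Python sketch (plain Python, not Elasticsearch code) that mimics how the three analyzers mentioned in the question would tokenize the text. The function names are made up for illustration, the regex is only an approximation of the standard tokenizer, and it assumes the field was indexed with something standard-like, as the asker's comment suggests.

```python
import re


def standard_like(text):
    # Approximation of the standard analyzer (assumption, not the real
    # Unicode segmentation rules): split on non-alphanumerics, lowercase.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


def whitespace_like(text):
    # Whitespace analyzer: split on whitespace only, case is preserved.
    return text.split()


def keyword_like(text):
    # Keyword analyzer: the whole input is emitted as a single token.
    return [text]


query = "X-Ray"
doc = "Portable X-Ray w/Fuji CR Reader"

# Terms that would be in the index if the field used a standard-like analyzer.
indexed = standard_like(doc)
print("indexed terms:", indexed)

for name, analyze in [
    ("standard", standard_like),
    ("whitespace", whitespace_like),
    ("keyword", keyword_like),
]:
    tokens = analyze(query)
    # Simple term lookup only; this is not phrase_prefix matching or scoring.
    print(name, tokens, all(t in indexed for t in tokens))
```

Under these assumptions, the whitespace and keyword variants keep X-Ray as a single, case-preserved token, which has no exact counterpart among the lowercased x and ray terms in the index. For the real token stream, run `_analyze` against the index itself.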

0
votes

I determined that my problem was that the standard analyzer was always being applied, because the mappings set a custom analyzer (which used the standard analyzer) on these fields.

This is shown here:

GET items/_mapping

which returns:

...
"manufacturer_name": {
    "type": "text",
    "fields": {
        "raw": {
            "type": "keyword",
            "normalizer": "lowercase_normalizer"
        }
    },
    "analyzer": "my_search_analyzer",
    "search_analyzer": "standard"
},
...

The same is true for the other two fields I was querying.

The lesson here:

If you are having issues with search, check the mappings to make sure no custom analyzers have been set on the fields you are querying.
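
For a large mapping, scanning by eye is tedious. Below is a small Python sketch of how that check could be automated. The helper name `fields_with_analyzers` is hypothetical, and the `properties` dictionary is just the fragment shown in this answer; in practice you would pass in the `properties` object from the `GET items/_mapping` response.

```python
def fields_with_analyzers(properties, prefix=""):
    """Yield (field_path, analyzer, search_analyzer) for fields that set either."""
    for name, spec in properties.items():
        path = prefix + name
        if "analyzer" in spec or "search_analyzer" in spec:
            yield path, spec.get("analyzer"), spec.get("search_analyzer")
        # Recurse into object fields ("properties") and multi-fields ("fields").
        for key in ("properties", "fields"):
            if key in spec:
                yield from fields_with_analyzers(spec[key], path + ".")


# Mapping fragment copied from this answer; other fields are omitted.
properties = {
    "manufacturer_name": {
        "type": "text",
        "fields": {
            "raw": {"type": "keyword", "normalizer": "lowercase_normalizer"}
        },
        "analyzer": "my_search_analyzer",
        "search_analyzer": "standard",
    }
}

for row in fields_with_analyzers(properties):
    print(row)
```

Any field this reports is worth a second look, since its index-time and search-time analysis may differ from what you expect.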