How to use multiple match_phrase queries in must with an OR condition in Elasticsearch?

GET questiondetails/question/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "tags.keyword": "azure-data"
          }
        },
        {
          "match_phrase": {
            "body": "azure-data"
          }
        },
        {
          "match_phrase": {
            "title": "azure-data"
          }
        },
        {
          "match_phrase": {
            "answers.body": "azure-data"
          }
        }
      ],
      "minimum_should_match": 1,
      "filter": {
        "range": {
          "creation_date": {
            "gte": 1584748800,
            "lte": 1585612800
          }
        }
      }
    }
  },
  "size": "10000"
}

I tried this query. I need results where the word matches exactly in tags, or matches partially in body, title, or answers.body.

But it's not working.

Adding a comment:

GET questiondetails_new/question/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "tags.keyword": "azure-data-factory"
          }
        },
        {
          "match_phrase": {
            "title": "azure-data-factory"
          }
        }
      ],
      "minimum_should_match": 1,
      "filter": {
        "range": {
          "creation_date": {
            "gte": 1585170170,
            "lte": 1585170180
          }
        }
      }
    }
  }
}

In this query I need all the docs whose tag is an exact match for azure-data-factory, or which have azure-data-factory in the title (a string). It should be an OR search. But it also matches documents whose tags have the value azure-data-factory-2.

Is the expectation to retrieve documents which have "azure-data" in title, body, or tags.keyword? - Abishek ram R

1 Answer

Add a char filter that replaces "-" with "_" at index time. You won't need to change the input text: searching with "-" still works, because the same analyzer is applied to the query text.

PUT index38
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_char_filter": {
          "type": "mapping",
          "mappings": [
            "- => _"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "char_filter": [
            "my_char_filter"
          ],
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
      "properties": {
        "tags": {
          "type": "text",
          "analyzer": "my_analyzer",
          "fields": {
            "keyword":{
              "type":"keyword"
            }
          }
        },
        "title": {
          "type": "text",
          "analyzer": "my_analyzer",
          "fields": {
            "keyword":{
              "type":"keyword"
            }
          }
        }
      }
    }
}

Query:

GET index38/_search
{
  "query": {
    "bool": {
      "should": [
        {
          "match_phrase": {
            "tags": "azure-data-factory"
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "match_phrase": {
                  "tags": "azure-data-factory"
                }
              }
            ],
            "should": [
              {
                "match_phrase": {
                  "body": "azure-data-factory"
                }
              },
              {
                "match_phrase": {
                  "title": "azure-data-factory"
                }
              },
              {
                "match_phrase": {
                  "answers.body": "azure-data-factory"
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  "size": "100"
}
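If the query body is assembled in application code, a small helper can keep the OR clauses and the date filter from the question together. The following is only a sketch in Python: `build_query` and its parameters are hypothetical names, not part of any Elasticsearch client. It emits a flattened `should` list, which selects the same documents as the nested `must_not` form above (A OR (NOT A AND B) is the same set as A OR B), although relevance scores may differ. It also re-adds the `creation_date` range filter from the question, which the query above leaves out.

```python
import json


def build_query(term, text_fields, tag_field="tags", date_range=None, size=100):
    """Build a bool query: phrase match on the tag field OR on any text field.

    Hypothetical helper for illustration; it only returns a plain dict.
    """
    # One match_phrase clause per field; any single clause may match (OR).
    should = [{"match_phrase": {tag_field: term}}]
    should += [{"match_phrase": {field: term}} for field in text_fields]

    bool_query = {"should": should, "minimum_should_match": 1}

    # Optional creation_date range, applied as a non-scoring filter.
    if date_range is not None:
        gte, lte = date_range
        bool_query["filter"] = {
            "range": {"creation_date": {"gte": gte, "lte": lte}}
        }

    return {"query": {"bool": bool_query}, "size": size}


query = build_query(
    "azure-data-factory",
    text_fields=["body", "title", "answers.body"],
    date_range=(1584748800, 1585612800),  # values taken from the question
)
print(json.dumps(query, indent=2))
```

The printed JSON can be sent as the body of a `_search` request against the index.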

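If you analyze the text "azure-data-factory" with the default analyzer, it generates three tokens: ["azure", "data", "factory"]. A phrase query for those three tokens therefore also matches "azure-data-factory-2", whose tokens begin with the same sequence.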

GET index38/_analyze
{
  "text": "azure-data-factory"
}

Result:

"tokens" : [
    {
      "token" : "azure",
      "start_offset" : 0,
      "end_offset" : 5,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "data",
      "start_offset" : 6,
      "end_offset" : 10,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "factory",
      "start_offset" : 11,
      "end_offset" : 18,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]

If you analyze the same text with "my_analyzer", a single token is generated, so a tag such as "azure-data-factory-2" (indexed as "azure_data_factory_2") no longer matches:

GET index38/_analyze
{
  "text": "azure-data-factory",
  "analyzer": "my_analyzer"
}

Result:

"tokens" : [
    {
      "token" : "azure_data_factory",
      "start_offset" : 0,
      "end_offset" : 18,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
]
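To see why the char filter changes which documents match, the two analyzers can be imitated in a few lines of Python. This is a rough sketch, not Elasticsearch itself: it assumes plain ASCII input and uses the regex `\w+` as a stand-in for the standard tokenizer (it splits on "-" and keeps "_" inside a token, which agrees with the `_analyze` results above). The names `analyze` and `phrase_matches` are made up for this illustration.

```python
import re


def analyze(text, dash_to_underscore=False):
    """Rough stand-in for the two analyzers above (ASCII input only)."""
    if dash_to_underscore:
        # Mimics the "- => _" mapping char filter of my_analyzer.
        text = text.replace("-", "_")
    # \w+ keeps letters, digits and underscores together; lowercase each token.
    return [token.lower() for token in re.findall(r"\w+", text)]


def phrase_matches(query_tokens, doc_tokens):
    """True if query_tokens occur consecutively, in order, in doc_tokens."""
    n = len(query_tokens)
    return any(
        doc_tokens[i:i + n] == query_tokens
        for i in range(len(doc_tokens) - n + 1)
    )


query_text = "azure-data-factory"
doc_text = "azure-data-factory-2"

# Default analysis: ["azure", "data", "factory"] is a prefix of the doc tokens.
default_hit = phrase_matches(analyze(query_text), analyze(doc_text))

# With the char filter: "azure_data_factory" vs "azure_data_factory_2".
custom_hit = phrase_matches(analyze(query_text, True), analyze(doc_text, True))

# A document whose tag is exactly the search term still matches.
exact_hit = phrase_matches(analyze(query_text, True), analyze(query_text, True))

print(default_hit, custom_hit, exact_hit)  # prints: True False True
```

With the default analysis the unwanted "azure-data-factory-2" document is a phrase match; once hyphens are mapped to underscores, each tag becomes one token and only the exact tag matches.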