I have a document with multiple nested objects. Nested queries work fine and filter correctly at the document level, but each hit still returns all of its nested objects (i.e. the whole document), even though the search query matches only a few of them.

Here is an example:

PUT /demo
{
  "mappings": {
    "company": {
      "properties": {
        "employees": {
          "type": "nested"
        }
      }
    }
  }
}

PUT /demo/company/1
{
  "id": 1,
  "name": "Google",
  "emp_count": 3,
  "employees": [{
    "id": 1,
    "name": "John",
    "address": {
      "city": "Mountain View",
      "state": "California",
      "country": "United States"
    }
  }]
}

PUT /demo/company/2
{
  "id": 1,
  "name": "Facebook",
  "emp_count": 3,
  "employees": [{
    "id": 1,
    "name": "Amber",
    "address": {
      "city": "Bangalore",
      "state": "Karnataka",
      "country": "India"
    }
  }, {
    "id": 1,
    "name": "Adrian",
    "address": {
      "city": "Palo Alto",
      "state": "California",
      "country": "United States"
    }
  }]
}

PUT /demo/company/3
{
  "id": 1,
  "name": "Microsoft",
  "emp_count": 3,
  "employees": [{
    "id": 1,
    "name": "Aman",
    "address": {
      "city": "New York",
      "state": "New York",
      "country": "United States"
    }
  }]
}

When searching for India in the address, I should ideally only get Facebook with one nested object, but I get all the nested objects. How can I filter the nested objects returned?

Example query:

{
  "query": {
    "function_score":{
      "query":{
        "nested":{
         "path":"employees",
         "score_mode":"max",
         "query": {
            "multi_match":{
              "query":"India",
              "type":"cross_fields",
              "fields":[
                "employees.address.city",
                "employees.address.country",
                "employees.address.state"
              ]
            }
          }
        }
      }
    }
  }
}

The output of this query is Facebook with all of its employees, whereas I only want Amber.

1 Answer

You can use inner_hits to obtain the desired result. Use the query below:

GET /demo/company/_search
{
  "query": {
    "nested": {
      "path": "employees",
      "query": {
        "match": { "employees.address.country": "India" }
      },
      "inner_hits": {}
    }
  }
}

Output will be:

"hits": {
  "total": 1,
  "max_score": 1.4054651,
  "hits": [
     {
        "_index": "demo",
        "_type": "company",
        "_id": "2",
        "_score": 1.4054651,
        "_source": {
           "id": 1,
           "name": "Facebook",
           "emp_count": 3,
           "employees": [
              {
                 "id": 1,
                 "name": "Amber",
                 "address": {
                    "city": "Bangalore",
                    "state": "Karnataka",
                    "country": "India"
                 }
              },
              {
                 "id": 1,
                 "name": "Adrian",
                 "address": {
                    "city": "Palo Alto",
                    "state": "California",
                    "country": "United States"
                 }
              }
           ]
        },
        "inner_hits": {
           "employees": {
              "hits": {
                 "total": 1,
                 "max_score": 1.4054651,
                 "hits": [
                    {
                       "_index": "demo",
                       "_type": "company",
                       "_id": "2",
                       "_nested": {
                          "field": "employees",
                          "offset": 0
                       },
                       "_score": 1.4054651,
                       "_source": {
                          "id": 1,
                          "name": "Amber",
                          "address": {
                             "city": "Bangalore",
                             "state": "Karnataka",
                             "country": "India"
                          }
                       }
                    }
                 ]
              }
           }
        }
     }
    ]
  }

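As you can see, the inner_hits section contains only the employees that match the criteria, while the top-level _source still holds every employee, so read the matches from inner_hits. Note that inner_hits was introduced in Elasticsearch 1.5.0, so you need version 1.5.0 or later. See the Elasticsearch documentation on inner hits for more information.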
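If you are consuming the response from code, here is a rough Python sketch of how the matching employees could be pulled out of inner_hits. It assumes the response has already been parsed into a dict with the shape shown above. The helper names `nested_query_with_inner_hits` and `matched_employees` are made up for illustration and are not part of any Elasticsearch client. The first helper keeps the multi_match from the question and drops the function_score wrapper, which has no functions attached.

```python
def nested_query_with_inner_hits(text):
    """Build the question's nested multi_match query with inner_hits enabled.

    Hypothetical helper: it only returns a request body as a plain dict.
    """
    return {
        "query": {
            "nested": {
                "path": "employees",
                "score_mode": "max",
                "query": {
                    "multi_match": {
                        "query": text,
                        "type": "cross_fields",
                        "fields": [
                            "employees.address.city",
                            "employees.address.country",
                            "employees.address.state",
                        ],
                    }
                },
                # An empty object asks for inner hits with default settings.
                "inner_hits": {},
            }
        }
    }


def matched_employees(response, path="employees"):
    """Return (company name, [matching nested objects]) for each hit.

    Hypothetical helper: reads inner_hits instead of the full _source,
    so only the nested objects that matched are returned.
    """
    results = []
    for hit in response["hits"]["hits"]:
        inner = hit.get("inner_hits", {}).get(path, {})
        matches = [h["_source"] for h in inner.get("hits", {}).get("hits", [])]
        results.append((hit["_source"]["name"], matches))
    return results


if __name__ == "__main__":
    # Trimmed-down copy of the response shown above.
    sample = {
        "hits": {
            "hits": [
                {
                    "_source": {
                        "name": "Facebook",
                        "employees": [{"name": "Amber"}, {"name": "Adrian"}],
                    },
                    "inner_hits": {
                        "employees": {
                            "hits": {"hits": [{"_source": {"name": "Amber"}}]}
                        }
                    },
                }
            ]
        }
    }
    print(matched_employees(sample))
```

The body returned by the first helper can be sent with whatever HTTP or Elasticsearch client you already use; the second helper works on the parsed JSON only.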