16
votes

Suppose, in Elasticsearch 5, I have data with nesting like:

{"number":1234, "names": [ 
  {"firstName": "John", "lastName": "Smith"}, 
  {"firstName": "Al", "lastName": "Jones"}
]},  
...

And I want to query for hits with number 1234 but return only the names that match "lastName": "Jones", so that my result omits names that don't match. In other words, I want to get back only part of the matching document, based on a term query or similar.

A simple nested query won't do, since it only filters top-level results. Any ideas?

{ "query" : { "bool": { "filter":[
    { "term": { "number":1234} },
    ????  something with "lastName": "Jones" ????
] } } }

I want back:

hits: [
   {"number":1234, "names": [ 
     {"firstName": "Al", "lastName": "Jones"}
   ]},  
   ...
]
3
The second answer should get you what you need, right? - Val
Did you find a good solution for your purpose? The accepted answer doesn't seem to be a solution, as you also commented below. I need exactly the same filtering on nested objects, but the inner hits are returned separately and the entire nested object list is also returned. Is it maybe not even possible with nested objects? Did you end up using parent-child? See my question here: stackoverflow.com/questions/48750696/… - Emil
I did not find exactly what I wanted. If I were in charge of elastic, I'd probably add this feature! - Patrick Szalapski
@Phillip Baumann's answer is what you need. Please check it and my comment. - Algorini

3 Answers

23
votes

The hits section returns each document's _source, which is exactly the document you indexed.

You are right that a nested query filters top-level results, but with inner_hits it also shows which nested objects caused each top-level document to be returned, and this is exactly what you need.

The names field can be excluded from the top-level hits using the _source parameter.

{
   "_source": {
      "excludes": ["names"]
   },
   "query":{
      "bool":{
         "must":[
            {
               "term":{
                  "number":{
                     "value":"1234"
                  }
               }
            },
            {
               "nested":{
                  "path":"names",
                  "query":{
                     "term":{
                        "names.lastName":"Jones"
                     }
                  },
                  "inner_hits":{
                  }
               }
            }
         ]
      }
   }
}

So now top-level documents are returned without the names field, and each hit has an additional inner_hits section containing the names that match.
You should treat nested objects as part of a top-level document. If you really need them to be separate - consider parent/child relations.
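
If you want the exact shape from the question (each document carrying only its matching names), one option is to stitch the inner hits back onto the top-level source on the client side. The following is a minimal sketch, not an Elasticsearch feature: the helper name attach_matching_names is hypothetical, the sample response is hand-written and abbreviated rather than captured from a cluster, and it assumes each inner hit's _source is the nested object itself.

```python
from typing import Any, Dict, List


def attach_matching_names(response: Dict[str, Any],
                          field: str = "names") -> List[Dict[str, Any]]:
    """Rebuild each hit as its _source plus only the nested objects that matched."""
    reshaped = []
    for hit in response.get("hits", {}).get("hits", []):
        # Shallow copy of the top-level fields (names was excluded via _source).
        doc = dict(hit.get("_source", {}))
        inner = hit.get("inner_hits", {}).get(field, {})
        nested_hits = inner.get("hits", {}).get("hits", [])
        # Assumption: each inner hit's _source is the nested object itself.
        doc[field] = [nested["_source"] for nested in nested_hits]
        reshaped.append(doc)
    return reshaped


# Hand-written, abbreviated sample response for illustration only.
sample_response = {
    "hits": {
        "hits": [
            {
                "_source": {"number": 1234},
                "inner_hits": {
                    "names": {
                        "hits": {
                            "hits": [
                                {"_source": {"firstName": "Al", "lastName": "Jones"}}
                            ]
                        }
                    }
                },
            }
        ]
    }
}

print(attach_matching_names(sample_response))
```

Keep in mind that inner_hits returns only the top three matches per document by default; set "size" inside the inner_hits object if a document can have more matching names than that.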

4
votes

Try something like this. The filtered query was removed in Elasticsearch 5.0, so the filters go inside a bool query instead:

{
   "query": {
      "bool": {
         "filter": [
            { "term": { "number": 1234 } },
            {
               "nested": {
                  "path": "names",
                  "query": {
                     "term": {
                        "names.lastName": "Jones"
                     }
                  },
                  "inner_hits": {}
               }
            }
         ]
      }
   }
}

I used this reference.

3
votes

Similar, but a bit different: put the nested query in a should clause and then look at inner_hits for the names. This returns the top-level document whether or not any name matches, and inner_hits holds whichever names did match.

{
   "_source": {
      "excludes": ["names"]
   },
   "query": {
      "bool": {
         "must": [
            {
               "term": {
                  "number": {
                     "value": "1234"
                  }
               }
            }
         ],
         "should": [
            {
               "nested": {
                  "path": "names",
                  "query": {
                     "term": {
                        "names.lastName": "Jones"
                     }
                  },
                  "inner_hits": {}
               }
            }
         ]
      }
   }
}
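
The only difference between this and the must-only version is where the nested clause sits. When a bool query already has a must clause, should clauses are optional, so a document with number 1234 is returned even if none of its names match. A small sketch that builds either request body makes the difference explicit; build_query and its require_match flag are hypothetical names used only for illustration.

```python
import json
from typing import Any, Dict


def build_query(number: int, last_name: str, require_match: bool) -> Dict[str, Any]:
    """Build the search body; the nested clause goes in must or should."""
    nested_clause = {
        "nested": {
            "path": "names",
            "query": {"term": {"names.lastName": last_name}},
            "inner_hits": {},
        }
    }
    bool_query: Dict[str, Any] = {
        "must": [{"term": {"number": {"value": number}}}]
    }
    if require_match:
        # Only documents with at least one matching name are returned.
        bool_query["must"].append(nested_clause)
    else:
        # Matching names are optional; they only affect scoring and inner_hits.
        bool_query["should"] = [nested_clause]
    return {"_source": {"excludes": ["names"]}, "query": {"bool": bool_query}}


print(json.dumps(build_query(1234, "Jones", require_match=False), indent=2))
```

Use require_match=True when documents without a matching name should be dropped entirely, and require_match=False when you want every document with that number back, with the matching names (if any) listed under inner_hits.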