I have nested documents in Elasticsearch like this:

[
  {
    "doctxt": "doca",
    "nested": [
      {
        "pos": 1,
        "txt": "terma"
      },
      {
        "pos": 2,
        "txt": "termb"
      },
      {
        "pos": 3,
        "txt": "termc"
      }
    ]
  },
  {
    "doctxt": "docb",
    "nested": [
      {
        "pos": 1,
        "txt": "termd"
      },
      {
        "pos": 2,
        "txt": "terma"
      },
      {
        "pos": 3,
        "txt": "termb"
      }
    ]
  }
]

Aggregation queries that do work:

Total count per term (match_all query, terms aggregation on txt). Result:

terma: 2
termb: 2
termc: 1
termd: 1

Facets on the txt field for a specific term (filter on term txt, terms aggregation on txt). Results:

terma: termb (2), termc (1), termd (1)
termb: terma (2), termc (1), termd (1)
termc: terma (1), termb (1)
termd: terma (1), termb (1)

What I can't do with these documents is the following:

Average pos for a specific term (I end up with the average over all nested positions of the matching documents, which in this case is always 2 for any term). Expected results:

terma: 1.5
termb: 2.5
termc: 3
termd: 1

Histogram of pos for a specific term (not working for the same reason as above). Expected results:

terma: pos 1 (1), pos 2 (1)
termb: pos 2 (1), pos 3 (1)
termc: pos 3 (1)
termd: pos 1 (1)
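
To pin down where the two sets of numbers come from, here is a small plain-Python sketch (no Elasticsearch involved) over the sample data. The helper names per_entry_stats and per_document_avg are made up for illustration. The first treats every nested entry on its own and reproduces the expected results; the second averages every pos of each document that contains the term, which is the behaviour I am seeing.

```python
from collections import Counter, defaultdict

# Sample data copied from the two documents above.
docs = [
    {"doctxt": "doca", "nested": [
        {"pos": 1, "txt": "terma"},
        {"pos": 2, "txt": "termb"},
        {"pos": 3, "txt": "termc"},
    ]},
    {"doctxt": "docb", "nested": [
        {"pos": 1, "txt": "termd"},
        {"pos": 2, "txt": "terma"},
        {"pos": 3, "txt": "termb"},
    ]},
]


def per_entry_stats(docs):
    """Group pos by txt, looking at one nested entry at a time."""
    positions = defaultdict(list)
    for doc in docs:
        for entry in doc["nested"]:
            positions[entry["txt"]].append(entry["pos"])
    return {
        term: {"avg": sum(p) / len(p), "hist": dict(Counter(p))}
        for term, p in positions.items()
    }


def per_document_avg(docs, term):
    """Average every pos of every document that contains the term."""
    pos = [
        entry["pos"]
        for doc in docs
        if any(e["txt"] == term for e in doc["nested"])
        for entry in doc["nested"]
    ]
    return sum(pos) / len(pos)


stats = per_entry_stats(docs)
print(stats["terma"])                   # {'avg': 1.5, 'hist': {1: 1, 2: 1}}
print(per_document_avg(docs, "terma"))  # 2.0
```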

If I have flat documents (doca-1-terma, doca-2-termb, doca-3-termc, docb-1-termd, ...) I get the expected results. Is this a limitation of aggregating nested documents, and should I store the data twice, once in a flat format and once in the current nested format?

1 Answer

No, not really. You can achieve what you need with nested types. Mapping the field as a nested type makes Elasticsearch treat each nested object individually, so aggregations inside a nested aggregation run per object and give your expected results.
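
As a rough sketch of what that could look like (not a tested setup): the mapping below assumes Elasticsearch 7+ syntax with keyword and integer fields, and the aggregation names entries, by_txt, avg_pos and pos_hist are made up. The request bodies are built as plain Python dicts so they can be serialized and sent with whatever client you use.

```python
import json

# Assumed mapping (Elasticsearch 7+ syntax). The important part is
# "type": "nested"; without it the array of objects is flattened and
# pos/txt pairs lose their association.
mapping = {
    "mappings": {
        "properties": {
            "doctxt": {"type": "keyword"},
            "nested": {
                "type": "nested",
                "properties": {
                    "pos": {"type": "integer"},
                    "txt": {"type": "keyword"},
                },
            },
        }
    }
}


def nested_pos_aggs(size=10):
    """Build a search body: average and histogram of pos per txt value."""
    return {
        "size": 0,  # only aggregation results are needed, no hits
        "aggs": {
            "entries": {
                # Step into the nested objects first.
                "nested": {"path": "nested"},
                "aggs": {
                    "by_txt": {
                        "terms": {"field": "nested.txt", "size": size},
                        "aggs": {
                            "avg_pos": {"avg": {"field": "nested.pos"}},
                            "pos_hist": {
                                "histogram": {
                                    "field": "nested.pos",
                                    "interval": 1,
                                }
                            },
                        },
                    }
                },
            }
        },
    }


body = nested_pos_aggs()
print(json.dumps(body, indent=2))
```

To restrict this to a single term, one option is to wrap by_txt in a filter aggregation (for example a term filter on nested.txt) inside the nested aggregation.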

Try this Gist: https://gist.github.com/vaidik/051a197654fe4b0ecc80

Also read this article about relationships. I think you can achieve the same with parent/child documents, but I haven't worked with them much.