The mapping for my "amazon" index is shown below. The "book" type has nested "authors", and each author has a multi-field "alias", so both "alias" and "alias.raw" should be valid fields within authors.

{
    mappings: {
        book: {
            properties: {
                title: {
                    type: "string",
                    fields: {
                        raw: {type: "string", index: "not_analyzed"}
                    }
                },
                authors: {
                    type: "nested",
                    properties: {
                        alias: {
                            type: "string",
                            fields: {
                                raw : {
                                    type: "string",
                                    index: "not_analyzed"
                                }
                            }
                        },
                        alias_raw: {
                            type: "string",
                            index: "not_analyzed"
                        }

                    }
                }
            }
        }
    }
}

Sample data:

{index: {_id: "1", _type: "book"}}
{title: "Being Awesome for Dummies", pages: "100", authors:[{firstName: "Apollo", lastName: "Cabrera", alias: "Mister Awesome", alias_raw: "Mister Awesome"}, {firstName:"Mark", lastName:"Twain", alias: "Julius Caesar", alias_raw: "Julius Caesar"}]}

{index: {_id: "2", _type: "book"}}
{title: "Understanding Women", pages: "100000", authors:[{firstName: "Megyn", lastName: "Kelly", alias:"Wonder Woman", alias_raw:"Wonder Woman"}, {firstName:"Donald", lastName:"Trump", alias:"Lone Ranger", alias_raw:"Lone Ranger"}]}

{index: {_id: "3", _type: "book"}}
{title: "Snap Chat", pages: "30", authors:[{firstName: "Hilary", lastName: "Clinton", alias:"Code Zero", alias_raw:"Code Zero"}, {firstName:"Harry", lastName:"Houdini", alias: "Abra Cadabra", alias_raw: "Abra Cadabra"}]}

My aggregation query looks like this...

{
    query: {
        match_all: {}
    },
    aggs: {
        authors: {
            nested: {
                path: "authors"
            },
            aggs: {
                aliases: {
                    terms: {
                        field: "alias.raw"
                    }
                }   
            }
        }
    }
}

When I execute the aggregation query, I get no aggregation results at all. If I use "alias_raw" instead (a separate field I added), I get the full alias as expected:

"aggregations" : {
    "authors" : {
      "doc_count" : 6,
      "aliases" : {
        "doc_count_error_upper_bound" : 0,
        "sum_other_doc_count" : 0,
        "buckets" : [ {
          "key" : "Abra Cadabra",
          "doc_count" : 1
        }, {
          "key" : "Code Zero",
          "doc_count" : 1
        }, {
          "key" : "Julius Caesar",
          "doc_count" : 1
        }, {
          "key" : "Lone Ranger",
          "doc_count" : 1
        }, {
          "key" : "Mister Awesome",
          "doc_count" : 1
        }, {
          "key" : "Wonder Woman",
          "doc_count" : 1
        } ]
      }
    }
  }

Is there another way to reach "alias.raw" when aggregating on a multi-field? If I use just "alias", I get the analyzed, tokenized, lowercased terms. I want the non-analyzed raw alias, and I don't think I need or want the redundant "alias_raw" field.
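
To illustrate the difference I mean, here is a rough Python sketch of what ends up in the buckets in each case. Note that approx_standard_analyze is a hypothetical stand-in I made up for illustration; it only lowercases and splits on non-word characters, whereas the real analyzer does more.

```python
import re
from collections import Counter


def approx_standard_analyze(text):
    """Hypothetical, very rough approximation of an analyzed string field:
    lowercase the text and split it on non-word characters."""
    return [tok for tok in re.split(r"\W+", text.lower()) if tok]


# Alias values taken from the sample data above.
aliases = [
    "Mister Awesome", "Julius Caesar", "Wonder Woman",
    "Lone Ranger", "Code Zero", "Abra Cadabra",
]

# Aggregating on the analyzed field buckets individual tokens.
analyzed_buckets = Counter(
    tok for alias in aliases for tok in approx_standard_analyze(alias)
)

# Aggregating on a not_analyzed field buckets the whole value.
raw_buckets = Counter(aliases)

print(sorted(analyzed_buckets))
print(sorted(raw_buckets))
```

The second list is what I am after: one bucket per complete alias.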

Thanks in advance :)

1 Answer

Figured it out. You have to use the full path for the field name, including the nested path "authors":

field: "authors.alias.raw"

i.e.

{
    query: {
        match_all: {}
    },
    aggs: {
        authors: {
            nested: {
                path: "authors"
            },
            aggs: {
                aliases: {
                    terms: {
                        field: "authors.alias.raw"
                    }
                }   
            }
        }
    }
}
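
If the request body is built in code, a small helper can prepend the nested path so the short field name never slips through. This is only a minimal sketch in Python: nested_terms_agg and its parameters are hypothetical names of my own, not part of any Elasticsearch client, and it produces nothing more than the request body shown above.

```python
import json


def nested_terms_agg(path, field, agg_name="aliases"):
    """Build a nested terms aggregation body (hypothetical helper).

    The field is always emitted as a full path, i.e. prefixed with the
    nested path, unless the caller already passed it that way.
    """
    prefix = path + "."
    full_field = field if field.startswith(prefix) else prefix + field
    return {
        "query": {"match_all": {}},
        "aggs": {
            path: {
                "nested": {"path": path},
                "aggs": {
                    agg_name: {"terms": {"field": full_field}},
                },
            }
        },
    }


body = nested_terms_agg("authors", "alias.raw")
print(json.dumps(body, indent=2))
```

Passing either "alias.raw" or "authors.alias.raw" yields the same body, with the terms field set to "authors.alias.raw".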

Aww right :)