I'm trying to create a Painless script to filter an array of nested fields by an array of custom params, but my for loop is throwing an error.

Mapping

    {
        "documents": {
            "mappings": {
                "document": {
                    "properties": {
                        "properties": {
                            "type": "nested",
                            "properties": {
                                "key": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                },
                                "value": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                }
                            }
                        },
                        "performances": {
                            "properties": {
                                "key": {
                                    "type": "text",
                                    "fields": {
                                        "keyword": {
                                            "type": "keyword",
                                            "ignore_above": 256
                                        }
                                    }
                                },
                                "value": {
                                    "type": "double"
                                }
                            }
                        },
                        "name": {
                            "type": "text",
                            "fields": {
                                "keyword": {
                                    "type": "keyword",
                                    "ignore_above": 256
                                }
                            }
                        },
                        "id": {
                            "type": "long"
                        }
                    }
                }
            }
        }
    }

_source

    "_source": {
        "properties": [{
                "value": [
                    "D"
                ],
                "key": "M"
            },
            {
                "value": [
                    "2019-12-31"
                ],
                "key": "DOCUMENT_DATE"
            },
            {
                "isMultiValue": false,
                "value": [
                    "Yes"
                ],
                "key": "ACTIVE_DOCUMENT"
            }
        ],
        "performances": [
            {
                "value": 123,
                "key": "performance1"
            },
            {
                "value": 234,
                "key": "performance3"
            },
            {
                "value": 345,
                "key": "performance5"
            },
            {
                "value": -456,
                "key": "someKey"
            },
            {
                "value": -567,
                "key": "someOtherKey"
            }
        ],
        "name": "documentName43",
        "id": "1234"
    }

The script looks like this:

    {
        "query": {
            "bool": {
                "filter": [{
                        "nested": {
                            "path": "properties",
                            "query": {
                                "bool": {
                                    "filter": [{
                                            "match": {
                                                "properties.key.keyword": "ACTIVE_DOCUMENT"
                                            }
                                        },
                                        {
                                            "match": {
                                                "properties.value": "yes"
                                            }
                                        }
                                    ]
                                }
                            }
                        }
                    },
                    {
                        "match": {
                            "id": "1234"
                        }
                    }
                ]
            }
        },
        "script_fields": {
            "nested_scores": {
                "script": {
                    "lang": "painless",
                    "source": "for (int i = 0; i < params['_source']['performances'].length; ++i) { if(params['_source']['performances'][i]['key'] == params['customFields'][i]) { return params['_source']['performances'][i]['value'];}}return 0;",
                    "params": {
                        "customFields": ["performance1", "performance3", "performance5"]
                    }

                }
            }
        },
        "_source": [
            "id",
            "name",
            "name."
        ]
    }

If I replace the "params['customFields'][i]" part with a plain string, it works just fine, so I'm guessing my problem is somewhere around there, but I can't figure out exactly what it is.

On another note, any idea how to structure my query so that the result from the painless script is returned inside the "_source"?

In the end I'd like to do something like this:

    "source": "for (int i = 0; i < params['_source']['performances'].length; ++i) {
                   for (int t = 0; t < params['customFields'].length; ++t) {
                       if (params['_source']['performances'][i]['key'] == params['customFields'][t]) {
                           return params['_source']['performances'][i]['value'];
                       }
                   }
               }
               return 0;"

But first I wanted to make it work with the code above.

I'm on ES 6.4, if that matters, and I'm first trying to run the query using the "elasticsearch-head" Chrome plugin.

However, the result there looks like this (I've changed some fields like "kennzahlen" to "properties" for convenience in my example above):

{
    "took": 679,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 4,
        "skipped": 0,
        "failed": 1,
        "failures": [{
            "shard": 0,
            "index": "fsl_prd_products",
            "node": "xAUFwT0LRVeAuOktIr9gaA",
            "reason": {
                "type": "script_exception",
                "reason": "runtime error",
                "script_stack": [
                    "java.util.ArrayList.rangeCheck(ArrayList.java:653)",
                    "java.util.ArrayList.get(ArrayList.java:429)",
                    "if (params['_source']['kennzahlen'][i]['key'] == params['customFields'][i]) { ",
                    " ^---- HERE"
                ],
                "script": "for (int i = 0; i < params['_source']['kennzahlen'].length; ++i) { if (!params['_source'].containsKey('kennzahlen')) { return 0; } if (params['_source']['kennzahlen'][i]['key'] == params['customFields'][i]) { return params['_source']['kennzahlen'][i]['value']; } } return 0;",
                "lang": "painless",
                "caused_by": {
                    "type": "index_out_of_bounds_exception",
                    "reason": "Index: 3, Size: 3"
                }
            }
        }]
    },
    "hits": {
        "total": 1,
        "max_score": 0,
        "hits": []
    }
}
Can you add a sample document and mapping? - jaspreet chahal
Added examples for mapping and source. - Nedko Georgiev

1 Answer

The results of script_fields are never going to appear in the _source part of the response -- they're always separate.

Let's replicate your use case using ES 7.2.0:

Setting up the index + ingesting (no whitespace for brevity)

PUT docs
{"mappings":{"properties":{"properties":{"type":"nested","properties":{"key":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"value":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}}}},"performances":{"properties":{"key":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"value":{"type":"double"}}},"name":{"type":"text","fields":{"keyword":{"type":"keyword","ignore_above":256}}},"id":{"type":"long"}}}}

POST docs/_doc
{"properties":[{"value":["D"],"key":"M"},{"value":["2019-12-31"],"key":"DOCUMENT_DATE"},{"isMultiValue":false,"value":["Yes"],"key":"ACTIVE_DOCUMENT"}],"performances":[{"value":123,"key":"performance1"},{"value":234,"key":"performance3"},{"value":345,"key":"performance5"},{"value":-456,"key":"someKey"},{"value":-567,"key":"someOtherKey"}],"name":"documentName43","id":"1234"}

then searching

GET docs/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "properties",
            "query": {
              "bool": {
                "filter": [
                  {
                    "match": {
                      "properties.key.keyword": "ACTIVE_DOCUMENT"
                    }
                  },
                  {
                    "match": {
                      "properties.value": "yes"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "match": {
            "id": "1234"
          }
        }
      ]
    }
  },
  "script_fields": {
    "nested_scores": {
      "script": {
        "lang": "painless",
        "source": """
          for (int i = 0; i < params['_source']['performances'].length; ++i) { 
            if (params['_source']['performances'][i]['key'] == params['customFields'][i]) { 
              return params['_source']['performances'][i]['value'];
            }

          }
          return 0;
        """,
        "params": {
          "customFields": [
            "performance1",
            "performance3",
            "performance5"
          ]
        }
      }
    }
  },
  "_source": [
    "id",
    "name",
    "name."
  ]
}

the system yields

[
  {
    "_index":"docs",
    "_type":"_doc",
    "_id":"vOhj8HEBG_KW3EFn7wOf",
    "_score":0.0,
    "_source":{
      "name":"documentName43",
      "id":"1234"
    },
    "fields":{
      "nested_scores":[            <-------
        123
      ]
    }
  }
]

If your query is failing, you may want to try some validity checks:

...
"source": """
          if (!params['_source'].containsKey('performances')) {
            return 0;
          }
          // rest of the script
"""
...
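
A `containsKey` check on its own would not prevent the `index_out_of_bounds_exception` from the question, though. The reported `Index: 3, Size: 3` is consistent with the single counter `i` indexing both lists: the loop is bounded by the length of the performances list, but `customFields` only has three entries, so any document whose first three entries don't match position by position ends up reading `customFields[3]`. Painless lists are `java.util.ArrayList` underneath (as the stack trace shows), so here is a minimal plain-Java sketch of the failing loop next to a membership test. The class, method, and helper names are made up for illustration, and `equals` stands in for the `==` used in the Painless script.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PerformanceLookup {

    // Mirrors the original script: the counter i indexes BOTH lists.
    static Object sharedIndex(List<Map<String, Object>> performances, List<String> customFields) {
        for (int i = 0; i < performances.size(); ++i) {
            // Throws once i reaches customFields.size() without an earlier match.
            if (performances.get(i).get("key").equals(customFields.get(i))) {
                return performances.get(i).get("value");
            }
        }
        return 0;
    }

    // Membership test: no second index, so no out-of-range access.
    static Object firstMatch(List<Map<String, Object>> performances, List<String> customFields) {
        for (Map<String, Object> performance : performances) {
            if (customFields.contains(performance.get("key"))) {
                return performance.get("value");
            }
        }
        return 0;
    }

    // Hypothetical helper that builds one {key, value} entry.
    static Map<String, Object> entry(String key, Object value) {
        Map<String, Object> m = new HashMap<>();
        m.put("key", key);
        m.put("value", value);
        return m;
    }

    public static void main(String[] args) {
        // Same kind of data as in the question, but with the non-matching keys first.
        List<Map<String, Object>> performances = Arrays.asList(
            entry("someKey", -456), entry("someOtherKey", -567),
            entry("performance1", 123), entry("performance5", 345));
        List<String> customFields = Arrays.asList("performance1", "performance3", "performance5");

        try {
            sharedIndex(performances, customFields);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("shared index failed: " + e.getMessage());
        }
        System.out.println(firstMatch(performances, customFields)); // prints 123
    }
}
```

In the Painless script itself, the equivalent change would presumably be either the nested loop from the question or a `params['customFields'].contains(...)` check; I haven't verified the latter on 6.4, so treat it as an assumption.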

I'm unsure, though, about what you're trying to do. If the condition in the loop is met, the script returns the first match and exits, so it may never get as far as, say, performance3 or performance5. Also, _source.performances may not be sorted, so, conversely, it might return performance5 and exit.
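
If what you actually want is every value whose key is listed in `customFields`, one option is to collect the matches into a list instead of returning on the first hit, which also makes the order of `_source.performances` irrelevant. That this is the goal is my assumption. The sketch below is plain Java rather than Painless, and its class, method, and helper names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PerformanceCollector {

    // Collects the value of every entry whose key is listed in customFields,
    // in document order, instead of returning on the first match.
    static List<Object> allMatches(List<Map<String, Object>> performances, List<String> customFields) {
        List<Object> matches = new ArrayList<>();
        if (performances == null) {
            return matches; // plays the role of the containsKey guard
        }
        for (Map<String, Object> performance : performances) {
            if (customFields.contains(performance.get("key"))) {
                matches.add(performance.get("value"));
            }
        }
        return matches;
    }

    // Hypothetical helper that builds one {key, value} entry.
    static Map<String, Object> entry(String key, Object value) {
        Map<String, Object> m = new HashMap<>();
        m.put("key", key);
        m.put("value", value);
        return m;
    }

    public static void main(String[] args) {
        // Deliberately unsorted input.
        List<Map<String, Object>> performances = Arrays.asList(
            entry("performance5", 345), entry("someKey", -456), entry("performance1", 123));
        List<String> customFields = Arrays.asList("performance1", "performance3", "performance5");

        System.out.println(allMatches(performances, customFields)); // prints [345, 123]
    }
}
```

Since the script field in the response above is already an array (`"nested_scores": [123]`), returning a list from the Painless script should fit that shape, but I haven't checked this on 6.4.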