I am getting some strange errors that are difficult to debug. I am running a simple UDF JavaScript mapper which maps the JSON data and imports it into BigQuery. I've run other UDF functions previously and never encountered such errors.
Is there any way to debug Dataflow template UDF errors, either with an actual debugger or at least with console.log or similar?
The error in question:
exception: "java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: java.lang.RuntimeException: java.lang.RuntimeException: org.json.JSONException: A JSONObject text must begin with '{' at 1 [character 2 line 1]
at com.google.cloud.dataflow.worker.GroupAlsoByWindowsParDoFn$1.output(GroupAlsoByWindowsParDoFn.java:183)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner$1.outputWindowedValue(GroupAlsoByWindowFnRunner.java:101)
at com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:54)
at com.google.cloud.dataflow.worker.util.BatchGroupAlsoByWindowReshuffleFn.processElement(BatchGroupAlsoByWindowReshuffleFn.java:37)
at com.google.cloud.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:114)
...
It's hard to tell what this error refers to: is the input data malformed, or is it the JSON output of the UDF?
Here is what I've tried so far:
- Unit tested the UDF locally with sample data
- Ran the integration tests with the exact same file I am trying to analyse in the real environment
- Used an empty JSON object ({}) as the input
- Used a UDF that returns an empty JSON object
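For context, here is a minimal sketch of the kind of defensive UDF that might help separate the two cases (bad input versus bad output). It is not my real mapper. It assumes the template calls the function once per element with a JSON string and expects a JSON string back. The function name `transform` and the 80-character preview length are placeholders, and it is written in ES5 style as a precaution.

```javascript
// Hypothetical UDF; the name `transform` is a placeholder for whatever
// function name the template job is configured to call.
// Assumption: the template passes one JSON string per element.
function transform(inJson) {
  var trimmed = String(inJson).trim();

  // Fail with a preview of the offending input, so the worker log shows
  // whether the input (rather than the UDF output) is the malformed side.
  if (trimmed.charAt(0) !== '{') {
    throw new Error('UDF input is not a JSON object: ' + trimmed.slice(0, 80));
  }

  var record = JSON.parse(trimmed);

  // Return the object serialized as a string, not a bare object or array.
  return JSON.stringify(record);
}
```

If a function like this still produces the same JSONException without its own error message appearing, that would suggest the problem lies on the output side or elsewhere in the pipeline, not in the input records.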
Any tips on debugging Dataflow JavaScript UDFs would be highly appreciated.
Is the source code of these Java classes available anywhere online?
