
Since a few days ago, I'm no longer able to submit my Dataflow jobs; they fail with the error below.

I tried submitting the simple WordCount job and it succeeded. Even a very simplified version of my own job works fine. But when I add more code (a GroupByKey transform), I'm no longer able to submit it.

Does anybody have any idea what this error means?

Thanks, G

Exception in thread "main" java.lang.RuntimeException: Failed to create a workflow job: Invalid JSON payload received. Unknown token.
       { 8r W
^
    at com.google.cloud.dataflow.sdk.runners.DataflowPipelineRunner.run(DataflowPipelineRunner.java:219)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:96)
    at com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner.run(BlockingDataflowPipelineRunner.java:47)
    at com.google.cloud.dataflow.sdk.Pipeline.run(Pipeline.java:145)
    at snippet.WordCount.main(WordCount.java:165)
Caused by: com.google.api.client.googleapis.json.GoogleJsonResponseException: 400 Bad Request
{
  "code" : 400,
  "errors" : [ {
    "domain" : "global",
    "message" : "Invalid JSON payload received. Unknown token.\n\u001F \b\u0000\u0000\u0000\u0000\u0000\u0000\u0000  \t{ 8r\u0000 W\n^",
    "reason" : "badRequest"
  } ],
  "message" : "Invalid JSON payload received. Unknown token.\n\u001F \b\u0000\u0000\u0000\u0000\u0000\u0000\u0000  \t{ 8r\u0000 W\n^",
  "status" : "INVALID_ARGUMENT"
}

1 Answer


To debug this issue, we want to inspect the request that is being made and find the invalid portion of the JSON payload. To do this we will:

  1. Increase logging verbosity
  2. Re-run the application and capture the logs
  3. Find the relevant section within the logs representing the JSON payload
  4. Validate the JSON payload

Increasing logging verbosity

By adding the following lines to your main method before you construct your pipeline, you tell the Java logging implementation to increase the verbosity for the "com.google.api" package. This in turn will log the HTTP requests/responses to Google APIs.

import java.util.logging.ConsoleHandler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class MyDataflowProgram {
  public static void main(String[] args) {
    // Send all log records for the "com.google.api" package to the console.
    ConsoleHandler consoleHandler = new ConsoleHandler();
    consoleHandler.setLevel(Level.ALL);
    Logger googleApiLogger = Logger.getLogger("com.google.api");
    googleApiLogger.setLevel(Level.ALL);
    googleApiLogger.setUseParentHandlers(false);
    googleApiLogger.addHandler(consoleHandler);

    // ... Pipeline Construction ...
  }
}
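For orientation, here is a minimal sketch of where the elided pipeline construction and job submission sit relative to the logging setup. It uses the Dataflow 1.x SDK classes visible in your stack trace; the input path and the Count transform are placeholders for illustration, not your actual job.

import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.TextIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner;
import com.google.cloud.dataflow.sdk.transforms.Count;

public class MyDataflowProgram {
  public static void main(String[] args) {
    // ... logging setup from above ...

    DataflowPipelineOptions options = PipelineOptionsFactory.fromArgs(args)
        .withValidation()
        .as(DataflowPipelineOptions.class);
    options.setRunner(BlockingDataflowPipelineRunner.class);

    Pipeline p = Pipeline.create(options);
    p.apply(TextIO.Read.from("gs://your-bucket/input/*"))  // placeholder input path
     .apply(Count.<String>perElement());                   // example transform backed by GroupByKey
    p.run();  // submission happens here; the HTTP request/response is logged at this point
  }
}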

Re-run the application and capture the logs

You will want to re-run your Dataflow application and capture the logs. How you do this depends on your development environment, OS, and/or IDE. For example, when using Eclipse the logs will appear within the Console window. Saving these logs will help you maintain a record of the issue.
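If copying console output is awkward in your setup, a sketch like the following writes the same log records to a file using java.util.logging's FileHandler in addition to the ConsoleHandler above. The file name "dataflow-submit.log" is an arbitrary choice for this example.

import java.io.IOException;
import java.util.logging.FileHandler;
import java.util.logging.Level;
import java.util.logging.Logger;
import java.util.logging.SimpleFormatter;

public class MyDataflowProgram {
  public static void main(String[] args) throws IOException {
    // Persist the "com.google.api" log records to a file as well.
    // "dataflow-submit.log" is an arbitrary file name chosen for this example.
    FileHandler fileHandler = new FileHandler("dataflow-submit.log");
    fileHandler.setLevel(Level.ALL);
    fileHandler.setFormatter(new SimpleFormatter());

    Logger googleApiLogger = Logger.getLogger("com.google.api");
    googleApiLogger.setLevel(Level.ALL);
    googleApiLogger.addHandler(fileHandler);

    // ... pipeline construction and p.run() as before ...
  }
}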

Find the relevant section within the logs representing the JSON payload

During re-execution of your Dataflow job, you will want to find the logs related to the submission of the job. These logs will contain the HTTP request followed by the response and will look like the following:

POST https://dataflow.googleapis.com/v1b3/projects/$GCP_PROJECT_NAME/jobs
Accept-Encoding: gzip
... Additional HTTP headers ...
... JSON request payload for creation ...
{"environment":{"clusterManagerApiService":"compute.googleapis.com","dataset":"bigquery.googleapis.com/cloud_dataflow","sdkPipelineOptions": ...

-------------- RESPONSE --------------
HTTP/1.1 200 OK
... Additional HTTP headers ...
... JSON response payload ...

You are interested in the request payload as the error you are getting indicates that it is the source of the problem.
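If the payload is hard to pick out by eye in a long log, a small helper along these lines can print just the request section. It assumes you saved the console output to a file (the hypothetical "dataflow-submit.log" from the earlier sketch) and that the marker strings match the sample log shown above.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

public class ExtractRequestPayload {
  public static void main(String[] args) throws IOException {
    // "dataflow-submit.log" is a hypothetical file containing the captured console output.
    List<String> lines = Files.readAllLines(Paths.get("dataflow-submit.log"));
    boolean inRequest = false;
    for (String line : lines) {
      if (line.startsWith("POST https://dataflow.googleapis.com")) {
        inRequest = true;            // start of the job-creation request
      } else if (line.contains("-------------- RESPONSE --------------")) {
        inRequest = false;           // stop before the response section
      }
      if (inRequest) {
        System.out.println(line);    // headers plus the JSON request payload
      }
    }
  }
}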

Validate the JSON payload

There are many JSON validators that can be used, but I prefer http://jsonlint.com/ because of its simplicity. If you are able, please share your findings by updating the question; if you get stuck, feel free to send me a private message.
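If you would rather validate the payload programmatically than paste it into a web form, a sketch like the one below will report the first parse error and its location. It assumes the Jackson databind library is on the classpath and that you saved the extracted payload to a hypothetical file named "payload.json".

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class ValidateJsonPayload {
  public static void main(String[] args) throws IOException {
    // "payload.json" is a hypothetical file containing only the JSON request payload.
    String payload = new String(Files.readAllBytes(Paths.get("payload.json")), "UTF-8");
    try {
      new ObjectMapper().readTree(payload);   // throws if the payload is not valid JSON
      System.out.println("Payload is valid JSON");
    } catch (JsonProcessingException e) {
      // The exception message includes the offending token and its line/column.
      System.out.println("Invalid JSON: " + e.getMessage());
    }
  }
}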