I am running a Sqoop job on Google Cloud to import data from a PostgreSQL database, using a query that joins 3 tables. The job fails with the error below: the data itself is imported within 15 minutes, but the command keeps running and then fails after 2 hours. I am able to extract other tables without any problem.

Data Size: 13GB

Command: gcloud compute ssh $INSTANCE_NAME --project=$PROJECT_ID --service-account=$ACCOUNT --command="""$SQOOP_HOME/bin/sqoop-import -D mapreduce.output.basename='$TABLE_EXPORT' --connect jdbc:$JDBC://$HOST_NAME:$PORT/$DATABASE --username $USERNAME --password '$PASSWORD' --target-dir $BUCKET_STORAGE -m $NUM_WORKERS --split-by $SPLIT_BY --query '$QUERY \$CONDITIONS ' --map-column-java $MAPPING_COLUMNS --fields-terminated-by '|' --null-string '' """ --zone=$ZONE 2>&1

Error:

20/06/12 22:14:36 INFO mapreduce.Job: map 0% reduce 0%

20/06/12 22:14:49 INFO mapreduce.Job: map 50% reduce 0%

20/06/12 22:14:50 INFO mapreduce.Job: map 75% reduce 0%

packet_write_wait: Connection to XX.XX.XXX.XXX port XX: Broken pipe

ERROR: (gcloud.compute.ssh) [/usr/bin/ssh] exited with return code [255].

Command exited with return code 1

1 Answer

The packet_write_wait: Connection to XX.XX.XXX.XXX port XX: Broken pipe error usually indicates that the SSH connection was dropped because it had been idle for a while, which would make sense considering your command fails after running for 2 hours.

To remedy that, another Stack post recommended keeping the connection alive by configuring ServerAliveInterval and ServerAliveCountMax, and I recommend that you read up on what they are used for.

Essentially, ServerAliveInterval sets a time interval in seconds after which, if no data has been received from the server, ssh will send a message through the encrypted channel to request a response from it, while ServerAliveCountMax sets the number of such messages that may go unanswered before ssh terminates the connection.

You would need to configure this in your client's ~/.ssh/config file, and you can check this Stack post for reference or this thread with the same issue.
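For illustration, here is a minimal sketch of what that configuration could look like; the host pattern and the values here are assumptions, so tune them to your environment:

Host *
    # Send a keep-alive probe after every 60 seconds of inactivity,
    # so the connection never looks idle to firewalls or NATs along the way.
    ServerAliveInterval 60
    # Give up only after 10 consecutive probes go unanswered,
    # i.e. after 600 seconds with no response from the server.
    ServerAliveCountMax 10

Since your command runs through gcloud compute ssh, which wraps /usr/bin/ssh, you should also be able to pass the same options per invocation with --ssh-flag, e.g. --ssh-flag='-o ServerAliveInterval=60' --ssh-flag='-o ServerAliveCountMax=10'.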