
I need to write the output of a Google Dataproc job to a text file.

For example, I have the query below:

gcloud dataproc jobs submit hive --cluster=$CLUSTER --region=$REGION \
    --execute="select count(*) from db.table;"

I only need the record count in the flat file.

I am using the redirection operator, as shown below. However, it gives me the entire output that gets printed to the console.

gcloud dataproc jobs submit hive --cluster=$CLUSTER --region=$REGION \
    --execute="select count(*) from db.table;" > text.csv

Desired output for me would be:

724

where 724 is the total number of records in my table.

Workaround:

gcloud dataproc jobs submit hive --cluster=$CLUSTER --region=$REGION \
    --execute="select count(*) from db.table;" &> text.csv

Use "&>" instead of ">" for the redirection. It sends both stdout and stderr to the text.csv file.
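To illustrate the difference between the two redirections, here is a minimal sketch that does not need a cluster. The emit function is a hypothetical stand-in for any command that writes to both streams; which stream gcloud uses for each part of its output is an assumption to verify on your own setup.

```shell
# emit: hypothetical stand-in for a command that writes to both streams.
emit() {
    echo "724"                # written to stdout
    echo "job log line" >&2   # written to stderr
}

tmpdir=$(mktemp -d)

# ">" captures stdout only; stderr is discarded here to keep the demo quiet.
emit > "$tmpdir/stdout_only.txt" 2>/dev/null

# "> file 2>&1" captures both streams; bash's "&> file" is shorthand for this.
emit > "$tmpdir/both.txt" 2>&1
```

After running this, stdout_only.txt holds only the line written to stdout, while both.txt holds the lines from both streams.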


1 Answer


The Dataproc Jobs API does not support this. You will probably have to parse the output before redirecting it. I have filed a feature request to separate stdout and stderr in the Jobs API output. Thanks for the feedback.
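As a rough sketch of that kind of output parsing, the helper below keeps only the last line that consists of a bare integer, optionally wrapped in table borders such as "| 724 |". The function name extract_count is made up for illustration, and the assumption that the count appears alone on its own line in the job output should be checked against what your cluster actually prints.

```shell
# extract_count (hypothetical helper): read job output on stdin and print the
# last line whose only content is an integer, with any surrounding "|" table
# borders and whitespace stripped.
# Assumption: the count appears alone on its own line in the output.
extract_count() {
    sed -n -E 's/^[|[:space:]]*([0-9]+)[|[:space:]]*$/\1/p' | tail -n 1
}

# Possible usage (needs a real cluster, so it is left commented out):
# gcloud dataproc jobs submit hive --cluster=$CLUSTER --region=$REGION \
#     --execute="select count(*) from db.table;" 2>&1 | extract_count > text.csv
```

Merging stderr into stdout with 2>&1 before the pipe means the filter sees the whole output regardless of which stream carries the query result. A query that returns several numeric rows would need a stricter filter than this one.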