
I am trying to use the bq CLI to export data from BigQuery to GCS. There are currently two projects, and each project has its own service account. I have authenticated the service accounts using gcloud auth activate-service-account, passing the key JSON file. When running my jobs I explicitly set the project and account using the commands below.

Within JOB1

gcloud config set account account1

gcloud config set project project1

bq extract --destination_format NEWLINE_DELIMITED_JSON table1 gs://path1

Within JOB2

gcloud config set account account2

gcloud config set project project2

bq extract --destination_format NEWLINE_DELIMITED_JSON table2 gs://path2

When JOB1 and JOB2 run in parallel, JOB1 fails with an error saying account2 does not have access to project1, and similarly in some instances JOB2 fails with an error saying account1 does not have access to project2. We have identified that this happens because setting the account changes the default account for the whole server (and not just for the session), so the other job running in parallel fails. Can you please help with how we can execute bq commands using multiple service accounts in parallel on the same server?

Why do you use a service account key and not your personal account? Why is it important to perform the extract in parallel? – guillaume blaquiere
Have you added service account A to project B and, vice versa, service account B to project A, while also assigning the correct roles? If a service account is not assigned the correct roles in each separate project, it could result in denial of access. Can you also supply the output of the error message? – Gustavo
@Guillaume Blaquiere: We use a service account so that the job continues to run even if I move out of the project. We perform the extracts in parallel as hourly runs in two separate jobs, as it's required by our downstream. – Berry Jenson
@Gustavo: We can't use the service account of project A to access data in project B, as the owners of the two projects are different and they are not giving permission. – Berry Jenson
When you say "it fails", do you mean no file is exported? Or do you have an error in your bash console but the operation ends correctly on the GCP side (files are exported to the bucket)? – guillaume blaquiere

1 Answer


Let me explain the process before going deeper into the solution. When you perform an operation on BigQuery, most of the time it's an asynchronous operation. The CLI makes it look synchronous, but it isn't.

The CLI performs the following (sketched below with plain bq commands):

  • Launch the job and get the jobId
  • Poll the jobId status regularly in a loop
  • Print the job result at the end (state DONE, or the error)
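For illustration, here is roughly that sequence driven by hand; the job ID project1:job_abc123 is only a placeholder for whatever ID the launch step actually prints:

  # Launch the extract without waiting; bq prints the job ID instead of polling it
  bq --nosynchronous_mode extract --destination_format NEWLINE_DELIMITED_JSON table1 gs://path1

  # Poll the job until it reaches the DONE state (the step the CLI normally loops on for you)
  bq wait project1:job_abc123

  # Print the final job state and any error
  bq show -j project1:job_abc123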

If you change your credentials, the CLI can't perform that polling loop, because it's no longer authorized to check this jobId on the project.

Now, you have 2 solutions:

  • If the job result is important for you in the CLI, you can't change the credentials: the extracts must run sequentially.
  • If the job result is not important and you prefer parallelism, you can use the --nosynchronous_mode parameter when you perform your extract (see the sketch after this list).
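For example, a minimal sketch of JOB1 in asynchronous mode (JOB2 would do the same with account2, project2, table2 and gs://path2); the account, project, table and bucket names are the ones from your question:

  gcloud config set account account1
  gcloud config set project project1

  # Returns immediately with a job ID instead of waiting for the extract to finish,
  # so there is no polling left to break when the other job switches the default credentials
  bq --nosynchronous_mode extract --destination_format NEWLINE_DELIMITED_JSON table1 gs://path1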

Note: I'm quite sure that if you use multiple Linux users and the sudo command, you can achieve what you want in synchronous mode. However, I'm not a Linux expert and I can't help much with this.
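If you want to try that route, here is a rough sketch of the idea, assuming two Linux users user1 and user2 already exist and each has activated its own service account (gcloud and bq keep their configuration per user under ~/.config/gcloud, so the two jobs no longer share a default account):

  # JOB1 runs with user1's gcloud configuration (account1/project1)
  sudo -u user1 -H bash -c 'bq extract --destination_format NEWLINE_DELIMITED_JSON table1 gs://path1' &

  # JOB2 runs with user2's gcloud configuration (account2/project2)
  sudo -u user2 -H bash -c 'bq extract --destination_format NEWLINE_DELIMITED_JSON table2 gs://path2' &

  # Wait for both extracts; each one keeps its normal synchronous polling
  wait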