
I have a Dataflow job that is written using Apache Beam. It looks similar to this template, but it saves data from JDBC to Cloud Storage:

https://github.com/GoogleCloudPlatform/DataflowTemplates/blob/master/src/main/java/com/google/cloud/teleport/templates/JdbcToBigQuery.java

My problem was that everybody could see the database credentials in the Dataflow UI. So I found this article

https://medium.com/google-cloud/using-google-cloud-key-management-service-with-dataflow-templates-71924f0f841f

which shows how to encrypt this data. I did everything as described in the article, but my Dataflow job does not decrypt the credentials with the given KMS key (when I run it using a Cloud Function).

So I tried running it in Cloud Shell:

gcloud dataflow jobs run JOB_NAME \
--region=us-west1 \
--gcs-location=TEMPLATE_LOCATION \
--dataflow-kms-key=projects/PROJECT_ID/locations/us-west1/keyRings/KEY_RING/cryptoKeys/KEY_NAME \
--parameters=...,KMSEncryptionKey=projects/PROJECT_ID/locations/us-west1/keyRings/KEY_RING/cryptoKeys/KEY_NAME,...

But I get this error:

Error message from worker: java.lang.RuntimeException: org.apache.beam.sdk.util.UserCodeException: com.google.api.gax.rpc.PermissionDeniedException: io.grpc.StatusRuntimeException: PERMISSION_DENIED: Permission 'cloudkms.cryptoKeyVersions.useToDecrypt' denied on resource 'projects/PROJECT_ID/locations/us-west1/keyRings/KEY_RING/cryptoKeys/KEY_NAME' (or it may not exist).

I am completely stuck. Has anyone had the same problem and could help?

Have you granted Dataflow access to the KMS key? – sethvargo

@sethvargo Yes, I granted permission to service-SERVICE_ID@dataflow-service-producer-prod.iam.gserviceaccount.com – mpj

1 Answer


You need to make sure that you have assigned the Cloud KMS CryptoKey Encrypter/Decrypter role to both the Dataflow service account and the Compute Engine service account.
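For example, a minimal sketch of granting that role on the key with gcloud, using the key coordinates from the question. PROJECT_NUMBER is a placeholder, and the member addresses assume the default Dataflow service agent and the Compute Engine default service account:

# Dataflow service agent
gcloud kms keys add-iam-policy-binding KEY_NAME \
  --keyring=KEY_RING \
  --location=us-west1 \
  --project=PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@dataflow-service-producer-prod.iam.gserviceaccount.com" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"

# Compute Engine default service account (Dataflow workers run as this account unless a custom worker service account is set)
gcloud kms keys add-iam-policy-binding KEY_NAME \
  --keyring=KEY_RING \
  --location=us-west1 \
  --project=PROJECT_ID \
  --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"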

Refer to this document, Cloud Key Management Service (Cloud KMS) encryption key with Dataflow

If you are using Cloud Functions, it might also be necessary to grant the Google Cloud Functions service agent service account the permissions to encrypt and decrypt using KMS.
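As a sketch, assuming the default Cloud Functions service agent address; if your function runs under a custom runtime service account, grant the role to that account instead:

gcloud kms keys add-iam-policy-binding KEY_NAME \
  --keyring=KEY_RING \
  --location=us-west1 \
  --project=PROJECT_ID \
  --member="serviceAccount:service-PROJECT_NUMBER@gcf-admin-robot.iam.gserviceaccount.com" \
  --role="roles/cloudkms.cryptoKeyEncrypterDecrypter"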

Ensure the user that is calling the encrypt and decrypt methods has the cloudkms.cryptoKeyVersions.useToEncrypt and cloudkms.cryptoKeyVersions.useToDecrypt permissions on the key used to encrypt or decrypt.

One way to permit a user to encrypt or decrypt is to add the user to the roles/cloudkms.cryptoKeyEncrypter, roles/cloudkms.cryptoKeyDecrypter, or roles/cloudkms.cryptoKeyEncrypterDecrypter roles.
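You can check which members actually hold these roles on the key, for example (assuming the same key coordinates as in the question):

gcloud kms keys get-iam-policy KEY_NAME \
  --keyring=KEY_RING \
  --location=us-west1 \
  --project=PROJECT_ID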

Also make sure the parameters passed are correct:

PYTHON

python -m apache_beam.examples.wordcount \
  --input gs://dataflow-samples/shakespeare/kinglear.txt \
  --output gs://STORAGE_BUCKET/counts \
  --runner DataflowRunner \
  --project PROJECT_ID \
  --temp_location gs://STORAGE_BUCKET/tmp/ \
  --dataflow_kms_key=KMS_KEY

JAVA

mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount \
  -Dexec.args="--inputFile=gs://dataflow-samples/shakespeare/kinglear.txt \
               --output=gs://STORAGE_BUCKET/counts \
               --runner=DataflowRunner --project=PROJECT_ID \
               --gcpTempLocation=gs://STORAGE_BUCKET/tmp \
               --dataflowKmsKey=KMS_KEY"
  -Pdataflow-runner

Specifying pipeline execution parameters