
I am working on a pet multi-cloud project (AWS and GCP) built on a serverless architecture.

Files generated by the business logic on the GCP side (using Cloud Functions and Pub/Sub) are stored in GCP Cloud Storage. I want to ingest these files dynamically from Cloud Storage into an AWS S3 bucket.

One possible way is to use the gsutil tool (Exporting data from Google Cloud Storage to Amazon S3), but that would require a compute instance and running the gsutil commands manually, which I want to avoid.


1 Answer


In answering this I'm reminded a bit of a Rube Goldberg-type setup, but I don't think it's too bad.

On the Google side, you would create a Cloud Function that is triggered whenever a new file is created, using the Object Finalize event. This function would collect the file's metadata and then call an AWS Lambda fronted by AWS API Gateway.
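
Here is a minimal sketch of that GCP side, assuming a 1st-gen Python Cloud Function deployed with a `google.storage.object.finalize` trigger; the API Gateway URL and API key environment variables are placeholders for your own endpoint and auth scheme:

```python
# Sketch of a GCP Cloud Function (1st gen, Python runtime) that forwards
# metadata about a newly finalized Cloud Storage object to AWS API Gateway.
# API_GATEWAY_URL and API_KEY are placeholders supplied via environment vars.
import os
import requests

API_GATEWAY_URL = os.environ["API_GATEWAY_URL"]  # your API Gateway endpoint
API_KEY = os.environ.get("API_KEY")              # optional API Gateway key

def on_object_finalize(event, context):
    """Triggered when a new object is finalized in the Cloud Storage bucket."""
    payload = {
        "bucket": event["bucket"],   # source GCS bucket
        "name": event["name"],       # object path within the bucket
        "size": event.get("size"),
        "contentType": event.get("contentType"),
    }
    headers = {"x-api-key": API_KEY} if API_KEY else {}
    # Hand the file metadata off to the AWS side via API Gateway.
    resp = requests.post(API_GATEWAY_URL, json=payload, headers=headers, timeout=30)
    resp.raise_for_status()
```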

The GCP function would pass the bucket and file information to the AWS Lambda. On the AWS side, the Lambda would use your GCP credentials and the Cloud Storage client library to download the file and then upload it to S3.
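
A sketch of that Lambda, assuming a Python runtime behind an API Gateway proxy integration, with the google-cloud-storage library packaged in the deployment artifact and GCP service-account credentials available via GOOGLE_APPLICATION_CREDENTIALS; DEST_BUCKET is a placeholder for your target S3 bucket:

```python
# Sketch of an AWS Lambda handler that pulls an object from GCS and copies
# it to S3, given the bucket/name metadata posted by the GCP Cloud Function.
import json
import os

import boto3
from google.cloud import storage  # GCP client; credentials via GOOGLE_APPLICATION_CREDENTIALS

s3 = boto3.client("s3")
DEST_BUCKET = os.environ["DEST_BUCKET"]  # target S3 bucket (placeholder)

def handler(event, context):
    body = json.loads(event["body"])      # payload sent by the GCP function
    gcs_bucket = body["bucket"]
    blob_name = body["name"]

    # Download the object from Cloud Storage into /tmp (Lambda's writable dir).
    blob = storage.Client().bucket(gcs_bucket).blob(blob_name)
    local_path = f"/tmp/{os.path.basename(blob_name)}"
    blob.download_to_filename(local_path)

    # Upload the same object to S3 under the same key.
    s3.upload_file(local_path, DEST_BUCKET, blob_name)

    return {"statusCode": 200, "body": json.dumps({"copied": blob_name})}
```

Keep in mind that Lambda's /tmp space is limited (512 MB by default), so very large files would need a streaming approach or a bigger ephemeral storage setting.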

Something like: [architecture diagram]

All serverless on both GCP and AWS. Testing isn't bad since you can keep the two sides separate: make sure GCP is sending what you want, and make sure AWS is parsing it and doing the right thing. Some authentication will likely need to happen from the GCP Cloud Function to API Gateway. Additionally, API Gateway can be eliminated entirely if you're okay with pulling the AWS client libraries into the GCP function (see the sketch below); since you already have to pull GCP libraries into the AWS Lambda, that shouldn't be much of a problem.
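
If you do drop API Gateway, the copy can happen entirely inside the Cloud Function. A sketch assuming the AWS credentials and destination bucket name are supplied through environment variables or Secret Manager (fine for small files, since the object is buffered in memory):

```python
# Alternative sketch without API Gateway: the Cloud Function itself uploads
# to S3 using boto3. AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and DEST_BUCKET
# are assumed to come from environment variables or Secret Manager.
import os

import boto3
from google.cloud import storage

s3 = boto3.client(
    "s3",
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
)
DEST_BUCKET = os.environ["DEST_BUCKET"]

def on_object_finalize(event, context):
    """Copy the newly finalized GCS object straight to S3."""
    blob = storage.Client().bucket(event["bucket"]).blob(event["name"])
    # Read the object's bytes and write them to S3 under the same key.
    s3.put_object(Bucket=DEST_BUCKET, Key=event["name"], Body=blob.download_as_bytes())
```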