2
votes

I have an AWS S3 bucket that receives multiple Parquet files every minute, produced by AWS Firehose after some processing. I need to sync these files in real time to a GCP Cloud Storage bucket, because we run a multi-cloud environment and further processing happens in GCP. How can I set up a real-time sync between the two buckets, so that as soon as a file lands in S3 it also appears in the GCS bucket? Any input is appreciated.

1
I'm not aware of any dedicated AWS tool for that. One way would be to enable event notifications on your bucket so that new uploads trigger a Lambda function. The Lambda would then copy the objects to GCP. - Marcin
You are using the term "real time". Firehose does not support Google Cloud Storage as a destination, so the answer is automatically NO. If you do not need real-time behavior, you can implement an event-driven system, such as a Lambda function, that copies objects from S3 to Cloud Storage. - John Hanley

1 Answer

1
votes

If you literally mean that updates happen in S3 and GCS atomically, that's not possible. The best you can do is run a job that receives a notification when an upload completes on one side and then initiates a copy to the other. You'd need to put some work into making the job robust against transient failures.
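
For illustration, here is a minimal sketch of such a job as an S3-triggered Lambda handler in Python. It is not a definitive implementation, and it rests on several assumptions: the `DEST_BUCKET` environment variable and its default value are hypothetical, GCP credentials are assumed to be available to the function already, each object is assumed small enough to hold in Lambda memory, and the helper names (`object_refs`, `with_retries`, `copy_object`) are made up for this example. The retry loop treats every exception as transient, which is a simplification.

```python
import os
import time
import urllib.parse

# Hypothetical configuration: name of the destination GCS bucket.
DEST_BUCKET = os.environ.get("DEST_BUCKET", "my-gcs-bucket")


def object_refs(event):
    """Yield (bucket, key) for each record in an S3 event notification."""
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys are URL-encoded in S3 event notifications.
        key = urllib.parse.unquote_plus(s3["object"]["key"])
        yield s3["bucket"]["name"], key


def with_retries(fn, attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let Lambda report the failure
            sleep(base_delay * 2 ** attempt)


def copy_object(s3_client, gcs_bucket, bucket, key):
    """Read one S3 object fully into memory and write it to GCS under the same key."""
    data = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    gcs_bucket.blob(key).upload_from_string(data)


def handler(event, context, s3_client=None, gcs_bucket=None):
    """Lambda entry point. Clients can be injected, which helps with local testing."""
    if s3_client is None:
        import boto3  # imported lazily so the helpers work without the SDK
        s3_client = boto3.client("s3")
    if gcs_bucket is None:
        from google.cloud import storage  # assumes google-cloud-storage is packaged
        gcs_bucket = storage.Client().bucket(DEST_BUCKET)

    copied = []
    for bucket, key in object_refs(event):
        with_retries(lambda: copy_object(s3_client, gcs_bucket, bucket, key))
        copied.append(key)
    return copied
```

If the handler still raises after the last attempt, Lambda's own retry behavior for asynchronous invocations applies, and a dead-letter queue or failure destination can capture events that never succeed. The copy is still asynchronous, so there is always a short window in which a file exists in S3 but not yet in GCS.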