0
votes

What is the best way to move or stream data out of Google Cloud Storage? Also, does Dataflow offer any feature to stream data from Google Cloud Storage to a destination outside GCP?

1

1 Answer

1
votes

The best way to move data out of Google Cloud Storage is probably the gsutil tool, or, for simplicity, Python with the boto plugin that Google provides. The details are at https://cloud.google.com/storage/docs/streaming. As for the second part of the question, Kafka can currently be connected to Cloud Dataflow, and I think you can use that to stream data out of GCP; Apache Beam has supported KafkaIO since 2016. I think the links below should help a lot.

https://cloud.google.com/blog/big-data/2016/09/apache-kafka-for-gcp-users-connectors-for-pubsub-dataflow-and-bigquery

https://github.com/apache/beam/tree/master/sdks/java/io/kafka
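
To make the streaming download idea from the answer a bit more concrete, here is a minimal Python sketch. It is not the boto plugin approach mentioned above; it assumes the `google-cloud-storage` client library is installed and that Application Default Credentials are configured. The helper names `stream_copy` and `stream_gcs_object`, the chunk size, and the bucket and object names in the usage note are all hypothetical, chosen only for illustration.

```python
from typing import BinaryIO

# 1 MiB per read; an arbitrary illustrative value, not a recommended setting.
CHUNK_SIZE = 1024 * 1024


def stream_copy(source: BinaryIO, sink: BinaryIO, chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source to sink one chunk at a time and return the bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_gcs_object(bucket_name: str, object_name: str, sink: BinaryIO) -> int:
    """Stream one Cloud Storage object into sink without holding it all in memory."""
    # Imported lazily so stream_copy stays usable without the client library.
    from google.cloud import storage

    client = storage.Client()  # assumes Application Default Credentials
    blob = client.bucket(bucket_name).blob(object_name)
    # blob.open("rb") returns a file-like reader that fetches the object in pieces.
    with blob.open("rb") as source:
        return stream_copy(source, sink)
```

Something like `stream_gcs_object("my-bucket", "exports/data.csv", sys.stdout.buffer)` would then pipe the object to standard output, where it can be forwarded to a system outside GCP. The sink can be any writable binary stream, such as a socket file or a local file.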