
My requirement is to batch process (or stream) files through Pub/Sub into Google Cloud Storage using Python scripts.

I have used the Python files listed below and can see the messages published from the topic arrive at the subscription. Now I want to combine these individual messages into one file and load that file into Cloud Storage.

Can you please suggest where to change the code in the scripts below so that message data is loaded into Cloud Storage as files (batching individual messages)?

The Python scripts are under this path: python-docs-samples/pubsub/cloud-client

subscriber.py
publisher.py

Another question: is it possible to stream files through Pub/Sub and load them into Cloud Storage?

Thanks


1 Answer


AFAIK, streaming files directly into GCS is not available; at least there is no built-in function for it. See the related question "Google pubsub to Google cloud storage".

Dataflow's TextIO.Write can write Pub/Sub messages to GCS, but it does not support streaming (unbounded) collections either. See "Streaming data to Google Cloud Storage from PubSub using Cloud Dataflow".
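
If Dataflow is not an option, one possible workaround is to do the batching yourself inside the subscriber callback in subscriber.py. The following is only a minimal sketch of that idea, not the sample's own code: `MessageBatcher`, `make_gcs_uploader`, and the bucket name are hypothetical names introduced for illustration, and it assumes the `google-cloud-storage` package is installed when the GCS uploader is used. It buffers message payloads, writes each full batch as one newline-delimited object, and acknowledges the messages only after the upload has succeeded.

```python
import threading
import time
from typing import Callable, List


class MessageBatcher:
    """Hypothetical helper: buffers payloads and uploads them as one object."""

    def __init__(self, upload: Callable[[str, bytes], None], max_messages: int = 100):
        self._upload = upload              # callable(object_name, payload_bytes)
        self._max_messages = max_messages  # flush once this many messages are buffered
        self._buffer: List[bytes] = []
        self._acks: List[Callable[[], None]] = []
        self._seq = 0
        # Subscriber callbacks may run on several worker threads, so guard the buffer.
        self._lock = threading.Lock()

    def add(self, data: bytes, ack: Callable[[], None] = lambda: None) -> None:
        """Buffer one payload; flush automatically when the batch is full."""
        with self._lock:
            self._buffer.append(data)
            self._acks.append(ack)
            if len(self._buffer) >= self._max_messages:
                self._flush_locked()

    def flush(self) -> None:
        """Upload whatever is buffered (call this on shutdown or from a timer)."""
        with self._lock:
            self._flush_locked()

    def _flush_locked(self) -> None:
        if not self._buffer:
            return
        self._seq += 1
        name = f"batch-{int(time.time())}-{self._seq:06d}.txt"
        self._upload(name, b"\n".join(self._buffer) + b"\n")
        # Acknowledge only after the upload succeeded, so that messages from a
        # failed batch are redelivered by Pub/Sub instead of being lost.
        for ack in self._acks:
            ack()
        self._buffer.clear()
        self._acks.clear()


def make_gcs_uploader(bucket_name: str) -> Callable[[str, bytes], None]:
    """Return an upload callable backed by google-cloud-storage (assumed installed)."""
    from google.cloud import storage  # imported lazily so the batcher works without it

    bucket = storage.Client().bucket(bucket_name)

    def upload(name: str, payload: bytes) -> None:
        bucket.blob(name).upload_from_string(payload, content_type="text/plain")

    return upload
```

In subscriber.py the callback would then hand each message to the batcher instead of printing it, for example `batcher = MessageBatcher(make_gcs_uploader("my-bucket"))` and, inside the callback, `batcher.add(message.data, message.ack)`. Keep `max_messages` below the subscriber's flow-control limit on outstanding messages, otherwise the batch can never fill up, and call `batcher.flush()` before exiting so a partial batch is not left behind.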