4
votes

I'd like to play around with Google Cloud Pub/Sub and processing messages in Dataflow. Are there any public data feeds in Pub/Sub that I can use to get started?

In the Dataflow WordCount example, input is read from a file in Cloud Storage, gs://dataflow-samples/shakespeare/kinglear.txt. It seems that dataflow-samples is accessible to all projects, which is very convenient for getting started. Is there anything similar for Pub/Sub?

2 Answers

8
votes

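Google currently maintains a public topic, projects/pubsub-public-data/topics/taxirides-realtime, as part of a Cloud Dataflow code lab. You cannot publish to it, but you can create a subscription in your own project that is attached to it.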
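As a rough sketch of what that looks like with the google-cloud-pubsub Python client (this is not taken from the code lab): the project ID and subscription ID are placeholders you supply, the helper names are my own, and I am assuming the payloads are UTF-8 text.

```python
# Sketch only: requires `pip install google-cloud-pubsub` and credentials
# for a project of your own. Helper names here are illustrative.

PUBLIC_TOPIC = "projects/pubsub-public-data/topics/taxirides-realtime"


def subscription_name(project_id, subscription_id):
    """Build the full resource name of a subscription in your own project."""
    return f"projects/{project_id}/subscriptions/{subscription_id}"


def peek_messages(project_id, subscription_id, max_messages=5):
    """Create a subscription on the public topic and pull a few messages."""
    # Imported here so the helper above works without the client installed.
    from google.cloud import pubsub_v1

    subscriber = pubsub_v1.SubscriberClient()
    name = subscription_name(project_id, subscription_id)

    # The subscription lives in your project; the topic lives in Google's.
    # This call fails if the subscription already exists.
    subscriber.create_subscription(request={"name": name, "topic": PUBLIC_TOPIC})

    response = subscriber.pull(
        request={"subscription": name, "max_messages": max_messages}
    )
    received = list(response.received_messages)

    # Acknowledge what we pulled so it is not redelivered.
    if received:
        subscriber.acknowledge(
            request={"subscription": name, "ack_ids": [m.ack_id for m in received]}
        )

    # Assumption: payloads are UTF-8 text.
    return [m.message.data.decode("utf-8") for m in received]
```

Calling peek_messages("my-project", "taxi-test-sub") with your own project ID should return a handful of raw messages. Delete the subscription when you are done so it does not keep accumulating a backlog.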

You can find more information on how to use it here.
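If you want to go straight to Dataflow, a minimal streaming pipeline with the Apache Beam Python SDK might look like the sketch below. The parse_message helper and the assumption that each payload is a UTF-8 JSON object are mine, so check a few raw messages before relying on it.

```python
# Sketch only: requires `pip install "apache-beam[gcp]"`.
import json

PUBLIC_TOPIC = "projects/pubsub-public-data/topics/taxirides-realtime"


def parse_message(data):
    """Decode one Pub/Sub payload. Assumes the bytes are a UTF-8 JSON object."""
    return json.loads(data.decode("utf-8"))


def run(argv=None):
    """Read the public topic as an unbounded source and print each message."""
    # Imported here so parse_message can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions, StandardOptions

    options = PipelineOptions(argv)
    # Pub/Sub is an unbounded source, so the pipeline must run in streaming mode.
    options.view_as(StandardOptions).streaming = True

    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            # ReadFromPubSub yields the message payload as bytes by default.
            | "ReadTaxiRides" >> beam.io.ReadFromPubSub(topic=PUBLIC_TOPIC)
            | "ParseJson" >> beam.Map(parse_message)
            | "Print" >> beam.Map(print)
        )
```

Call run() to start it. It keeps running until you cancel it, and to run on Dataflow rather than locally you would pass the usual runner, project, region and temp location options.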

Additionally, you can use Dataflow with BigQuery, for which Google provides a comprehensive set of public datasets.

-1
votes

What do you mean by public datasets in Cloud Pub/Sub? In Cloud Pub/Sub, you have topics, publishers that send messages to those topics, and subscribers that receive messages from them. Every topic belongs to a project, so it doesn't make sense to have a public topic, if that's what you're asking.