0
votes

I am working on an IoT application where I need to read streaming data from a Pub/Sub topic. I want to read this data using the Google Cloud Dataflow SDK. I am using Java 1.8.

I am using the trial version of Google Cloud Platform. When I used the PubsubIO.Read method to read the streaming data, I got errors in the log file saying that my project does not have enough CPU quota to run the app.

So I want to read the streaming data using Google Cloud Dataflow SDK.

Can someone please let me know where I can find complete examples of reading streaming data using the Google Cloud Dataflow SDK?

Thanks in advance.

1
PubsubIO.Read is the correct way. Please post the errors you were getting, and ideally your job ID. - jkff
(bdaaec3d0f58e13): Workflow failed. Causes: (50ffe37e5aed6b7): Project xxxxxxxx has insufficient quota(s) to execute this workflow with 3 instances in region us-central1. Quota summary (available/required): 80/3 instances, 8/12 CPUs, 2048/1260 disk GB, 1024/0 SSD disk GB, 100/1 instance groups, 50/1 managed instance groups, 100/1 instance templates, 23/3 in-use IP addresses. Please see cloud.google.com/compute/docs/resource-quotas about requesting more quota. - jumping jack
Please find the error message from the logs above. Kindly let me know your thoughts. - jumping jack

1 Answer

2
votes

A number of complete examples are available in the Cloud Dataflow documentation under Complete Examples and, linked from there, on GitHub, also under Complete Examples.

According to your error message, you indeed do not have enough CPU quota to run the default configuration of three 4-vCPU (n1-standard-4) workers: the job requires 12 CPUs, and the CPU quota for the Google Cloud trial is 8 CPUs.

You can configure your job to require fewer CPUs, e.g. by using fewer workers (--numWorkers=1) or a smaller machine type (--workerMachineType=n1-standard-1).
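
The quota arithmetic behind these options can be sketched in plain Java. This is only an illustration and not part of the Dataflow SDK: the class QuotaCheck and its methods are hypothetical names, the vCPU counts (1 for n1-standard-1, 4 for n1-standard-4) come from the machine type definitions, and the 8-CPU quota comes from the error message above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: checks whether a worker configuration fits a CPU quota.
class QuotaCheck {
    // vCPUs per worker for the two machine types mentioned in this thread.
    static final Map<String, Integer> VCPUS = new HashMap<>();
    static {
        VCPUS.put("n1-standard-1", 1);
        VCPUS.put("n1-standard-4", 4);
    }

    // Total CPUs a job needs: number of workers times vCPUs per worker.
    static int requiredCpus(int numWorkers, String machineType) {
        Integer perWorker = VCPUS.get(machineType);
        if (perWorker == null) {
            throw new IllegalArgumentException("Unknown machine type: " + machineType);
        }
        return numWorkers * perWorker;
    }

    // True if the configuration stays within the given CPU quota.
    static boolean fitsQuota(int numWorkers, String machineType, int cpuQuota) {
        return requiredCpus(numWorkers, machineType) <= cpuQuota;
    }

    public static void main(String[] args) {
        int trialQuota = 8; // CPUs available, as reported in the error message

        // Default configuration from the error: 3 x n1-standard-4 = 12 CPUs.
        System.out.println(fitsQuota(3, "n1-standard-4", trialQuota)); // false
        // --numWorkers=1 with the same machine type = 4 CPUs.
        System.out.println(fitsQuota(1, "n1-standard-4", trialQuota)); // true
        // --workerMachineType=n1-standard-1 with 3 workers = 3 CPUs.
        System.out.println(fitsQuota(3, "n1-standard-1", trialQuota)); // true
    }
}
```

Either change alone brings the job under the 8-CPU limit in this sketch. An actual job may also consume other quotas, such as disk and IP addresses, which the error message lists separately.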