I am working with Apache Flink and Apache Spark, each with a Twitter connector (flink-connector-twitter_2.12 and spark-streaming-twitter from Apache Bahir), to receive real-time tweets and classify them with an SVM.
Flink:
import org.apache.flink.streaming.connectors.twitter.TwitterSource
val streamSource: DataStream[String] = strEnv.addSource(new TwitterSource(properties))
...
Spark:
import org.apache.spark.streaming.twitter.TwitterUtils
TwitterUtils.createStream(streamingContext, auth)
...
Both applications run on a cluster and use the APIs shown above.
My problem is the low input rate from Twitter. The Spark application averages 51.98 records/sec, which is extremely low compared to the real Twitter volume (6k tweets per second).
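To put the gap in numbers, here is a minimal Scala sketch that uses only the two figures above. The object and value names are made up for illustration, and the 6k/sec figure is the rough estimate quoted above, not a measured value.

```scala
object InputRateGap {
  // Figures taken from the question text.
  val observedPerSec: Double = 51.98   // average input rate seen in the Spark application
  val expectedPerSec: Double = 6000.0  // rough overall Twitter volume ("6k per second")

  // Fraction of the overall volume that actually reaches the application.
  val fraction: Double = observedPerSec / expectedPerSec

  def main(args: Array[String]): Unit =
    println(f"Received share of overall volume: ${fraction * 100}%.2f%%")
}
```

So the application receives less than 1% of the overall volume.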
Question: Is there any way to improve the input rate?
I appreciate any help :) thanks