
Every 30 minutes, I store Twitter's trending topics for a country Y in a database. No problem with that. Now, I want to get as many tweets as possible matching those trending topics, for research purposes.

Since I would like to study the patterns of the trends, I need continuous tweet data covering at least 3 days, centered on the day the trend's peak was detected, for every trending topic. To achieve that, I thought of doing the following:

Suppose today is day X. I could retrieve the unique trends of day X-2 and, for every trend, look for matching tweets in the interval [X-3, X-1], that is, 3 days. However, the problem here is Twitter's rate limiting. If I have 100 trending topics on day X-2 and I make 20 GET search requests per trend, I end up making a total of 2,000 requests, which exceeds Twitter's hourly rate limit of 350. If I make 300 requests/hour, it would take more than 6 hours to get the data for only one day...
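To make the arithmetic above concrete, here is a minimal sketch of the window and request-budget calculation. The function names `search_window` and `request_budget` are made up for illustration; the numbers are the ones from the question.

```python
from datetime import date, timedelta


def search_window(today):
    """Return (start, end) of the 3-day window [X-3, X-1] for day X."""
    return today - timedelta(days=3), today - timedelta(days=1)


def request_budget(num_trends, requests_per_trend, requests_per_hour):
    """Return (total requests, hours needed at the given hourly pace)."""
    total = num_trends * requests_per_trend
    hours = total / requests_per_hour
    return total, hours


# Figures from the question: 100 trends, 20 requests each, 300 requests/hour.
total, hours = request_budget(100, 20, 300)
start, end = search_window(date.today())
print(f"{total} requests, about {hours:.2f} hours, window {start} to {end}")
```

With these inputs the total is 2,000 requests, which at 300 requests/hour works out to between 6 and 7 hours per day of trends.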

Does anybody know any other (better) way for getting tweets associated with trends?

Thanks in advance


1 Answer


Twitter Streaming API?

The Twitter Streaming API doesn't deliver any past tweets; you only receive tweets posted after the connection to the server is established. The Search API will, in theory, return tweets matching the query that are up to 7 days old, but the actual window depends entirely on Twitter’s current load. (Note: at times this interval has been as short as 24 hours. In addition, you can receive at most 1,500 tweets per query, regardless of how old they are.)

Is there any way to get more tweets from the Streaming API?

None that I know of. However, the information below may help if you are deciding between the Search API and the Streaming API.

Please choose your case:

  • If you need real-time data and your number of requests is high:

Go for the Streaming API

The Streaming API requires that you keep the connection open. In practice this means running a server process with an infinite loop that keeps receiving the latest tweets.
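The "server process with an infinite loop" could look roughly like the sketch below. It is not tied to any real Twitter client library: `connect` is a hypothetical callable that opens the stream and returns an iterable of tweets, `handle` is whatever you do with each tweet (for example, insert it into your database), and `max_connections` exists only so the loop can be stopped in tests.

```python
import time
from typing import Callable, Iterable, Optional


def run_stream(
    connect: Callable[[], Iterable[dict]],
    handle: Callable[[dict], None],
    max_connections: Optional[int] = None,
    backoff_seconds: float = 1.0,
) -> None:
    """Keep a streaming connection alive, reconnecting when it drops.

    connect: hypothetical function that opens the stream (assumption).
    handle: called once for every tweet received.
    max_connections: None means loop forever, as a real server process would.
    """
    connections = 0
    while max_connections is None or connections < max_connections:
        connections += 1
        try:
            for tweet in connect():
                handle(tweet)
        except ConnectionError:
            # The stream dropped; fall through and reconnect.
            pass
        # Wait a little before reconnecting to avoid hammering the server.
        if max_connections is None or connections < max_connections:
            time.sleep(backoff_seconds)
```

Tweets posted while the process is disconnected are lost, since the stream only delivers tweets from the moment the connection is established.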

Advantages

1) Low lag in retrieving results: tweets delivered with this method are basically real-time, with a lag of a second or two at most between the time a tweet is posted and the time it is received from the API.

2) Not rate limited.

  • If you need aggregate data regardless of its time range and your number of requests is not high:

Go for the Search API

The Search API is the easier of the two methods to implement, but it is rate limited. Each request returns up to 100 tweets, and you can use a page parameter to request up to 15 pages, giving you a theoretical maximum of 1,500 tweets for a single query.
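The paging described above can be sketched as follows. `fetch_page` is a hypothetical function standing in for the actual HTTP call to the Search API (it is an assumption, not a real library function); only the limits of 100 tweets per page and 15 pages come from the text.

```python
from typing import Callable, List

MAX_PAGES = 15   # page limit stated above
PER_PAGE = 100   # maximum tweets per request stated above


def collect_tweets(
    query: str,
    fetch_page: Callable[[str, int, int], List[dict]],
) -> List[dict]:
    """Page through search results until they run out or the page limit hits.

    fetch_page(query, page, per_page) is a placeholder for the real request.
    """
    tweets: List[dict] = []
    for page in range(1, MAX_PAGES + 1):
        batch = fetch_page(query, page, PER_PAGE)
        if not batch:
            break  # no more results for this query
        tweets.extend(batch)
        if len(batch) < PER_PAGE:
            break  # a short page means this was the last one
    return tweets
```

Each page costs one request against the rate limit, so a trend that fills all 15 pages uses 15 requests and still yields at most 1,500 tweets.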

Advantages

1) Finding tweets in the past: the Search API wins by default in this area, because the Streaming API doesn’t deliver any past tweets.

2) Easier to implement.