1
votes

I am new to the Twitter API and have spent a tremendous amount of time trying to figure this out.

I would like to extract a large number (100k - 1m) of tweets for a given search term, starting from the most recent tweets. I tried working with tweepy and was able to set up a stream, but I need data from the past as well.

I also tried the following code, but it only gives me 100 tweets at a time, and I don't understand how to use since_id and max_id to page through past tweets. Also, does anyone know how to extract hashtags from a post? Currently I am splitting each post into words and keeping the ones that contain "#", but api.search has an attribute 'hash' and I am not sure how to call it.

results = api.search(q=movies[0],count=100,lang='en')

Any guidance would be appreciated.

3
You simply can't with Twitter API. Any given query is limited to return up to 3200 tweets (in chunks of up to 100 items). - alko
Even if I open a stream and let it sit? Is there a limit on how much I can stream? Also, are there ways around it through some other methods? - Sohail
nope, stream is unlimited, but if you're talking about "but I need the data from past as well", then search is your only option - alko
The tweets returned by the API should have an "entities" metadata listing the hashtags from the tweets. You also can use libraries to extract hashtags, like github.com/ianozsvald/twitter-text-python - Mehdi

3 Answers

1
votes

You can collect the tweets in a results list by iterating over a Cursor:

results = []
# Get the first 1000 items matching the search query and store them
for tweet in tweepy.Cursor(api.search, q='%23Trump').items(1000):
    results.append(tweet)
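
On the hashtag part of the question: as mentioned in the comments, each returned tweet should carry "entities" metadata that lists its hashtags. The following is only a minimal sketch of reading that metadata. The helper name extract_hashtags is hypothetical, and it assumes each tweet object exposes an entities dictionary whose "hashtags" entries have a "text" field.

```python
def extract_hashtags(tweet):
    """Return the hashtag strings (without '#') listed in a tweet's entities.

    Assumption: `tweet.entities` is a dict shaped like
    {"hashtags": [{"text": "Trump", ...}, ...]}.
    """
    # Fall back to an empty dict if the tweet has no entities at all
    entities = getattr(tweet, "entities", None) or {}
    return [tag["text"] for tag in entities.get("hashtags", [])]
```

With the results list above, something like [extract_hashtags(t) for t in results] would then give the hashtags per tweet, without splitting the text on "#" by hand.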
0
votes

You will want to use a Tweepy Cursor. To create a Cursor, pass it the api method, and any parameters:

cursor = tweepy.Cursor(api.search, q=movies[0], count=100, lang='en')

Then, iterate over the results returned by the Cursor's items method. You can pass in an optional limit of results:

for item in cursor.items(limit=20):  # the limit can be omitted
    print(item.text)  # do something with the item
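
Since the question also asks how max_id is meant to be used, here is a rough sketch of paging backwards by hand, which is roughly what the Cursor takes care of for you. The function name collect_tweets is hypothetical. It assumes that the search callable you pass in (for example api.search) accepts q, count and max_id keyword arguments and returns a list of objects with an id attribute.

```python
def collect_tweets(search, query, total, page_size=100, **kwargs):
    """Page backwards through search results using max_id.

    `search` is any callable taking q, count and (optionally) max_id,
    e.g. api.search. Stops early if the search runs out of results.
    """
    collected = []
    max_id = None
    while len(collected) < total:
        params = dict(q=query, count=min(page_size, total - len(collected)), **kwargs)
        if max_id is not None:
            params["max_id"] = max_id
        page = search(**params)
        if not page:
            break  # nothing older is available
        collected.extend(page)
        # max_id is inclusive, so the next request asks for tweets
        # strictly older than the oldest one seen so far
        max_id = min(tweet.id for tweet in page) - 1
    return collected[:total]
```

Usage would look something like collect_tweets(api.search, movies[0], 1000, lang='en'). This does not lift the limits mentioned in the comments, and rate limits still apply between requests.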
0
votes

The total archive is limited to 3200 tweets, but there is also a daily limit of 1500.