I would like to collect all tweets that contain one of the following words: Bitcoin, Ethereum, Litecoin or Denarius.
However, I want to exclude tweets that can be classified as retweets and tweets that contain links. I know from the following website (https://www.followthehashtag.com/help/hidden-twitter-search-operators-extra-power-followthehashtag) that I can add -filter:links to exclude tweets that contain links. This is clearly visible by comparing the following search URL:
https://twitter.com/search?f=tweets&vertical=news&q=Bitcoin&src=typd
with https://twitter.com/search?f=tweets&q=Bitcoin%20-filter%3Alinks&src=typd
The same applies for retweets, where I can use -filter:retweets (see https://twitter.com/search?f=tweets&q=Bitcoin%20-filter%3Aretweets&src=typd)
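To make the comparison above reproducible, here is a minimal sketch of how those search URLs can be built from a term and a list of excluded filters. The helper name `build_search_url` is my own and hypothetical; the URL layout is simply copied from the links above, so it may not match how the search page behaves in general.

```python
from urllib.parse import quote

# Base URL and query layout copied from the search links above (an assumption,
# not a documented API).
BASE_URL = "https://twitter.com/search"


def build_search_url(term, exclude_filters=()):
    """Return a search URL for `term` with a -filter:<name> operator per entry."""
    operators = ["-filter:" + name for name in exclude_filters]
    query = " ".join([term] + operators)
    # quote() percent-encodes the space (%20) and the colon (%3A).
    return "{}?f=tweets&q={}&src=typd".format(BASE_URL, quote(query))


print(build_search_url("Bitcoin", ["links"]))
print(build_search_url("Bitcoin", ["retweets"]))
```

The two printed URLs are the same ones I compared manually in the browser.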
I want to add these criteria to reduce the "noise" and to be less likely to hit any API limits. I wrote the following Python script:
import sys
import time
import json
import pandas as pd
from tweepy import OAuthHandler
from tweepy import Stream
from tweepy.streaming import StreamListener

# Twitter API credentials (left blank here)
USER_KEY = ''
USER_SECRET = ''
ACCESS_TOKEN = ''
ACCESS_SECRET = ''

# Keywords to track, plus the two search operators I am trying to apply
crypto_tickers = ['bitcoin', 'ethereum', 'litecoin', 'denarius', '-filter:links', '-filter:retweets']


class StdOutListener(StreamListener):
    def on_data(self, data):
        # Each message arrives as a raw JSON string
        tweet = json.loads(data)
        print(tweet)

    def on_error(self, status):
        if status == 420:
            sys.stderr.write('Enhance Your Calm; The App Is Being Rate Limited For Making Too Many Requests')
            return True
        else:
            sys.stderr.write('Error {}\n'.format(status))
            return True


if __name__ == "__main__":
    listener = StdOutListener()
    auth = OAuthHandler(USER_KEY, USER_SECRET)
    auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
    stream = Stream(auth, listener)
    stream.filter(languages=['en'], track=crypto_tickers)
However, the output clearly shows tweets that are retweets and contain links.
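As a fallback, I could drop the unwanted tweets on the client side after they arrive. The following is only a rough sketch of that idea, not a confirmed solution: it assumes that retweets in the stream payload carry a `retweeted_status` key and that links appear under `entities['urls']` or `entities['media']` (or under `extended_tweet` for long tweets). The helper names `is_retweet`, `has_link` and `keep_tweet` are hypothetical, and the sample payloads are hand-written for illustration.

```python
import json


def is_retweet(tweet):
    # Assumption: a retweet embeds the original tweet under 'retweeted_status'.
    return "retweeted_status" in tweet


def has_link(tweet):
    # Assumption: long tweets keep their full entities under 'extended_tweet'.
    extended = tweet.get("extended_tweet") or {}
    entities = extended.get("entities") or tweet.get("entities") or {}
    return bool(entities.get("urls")) or bool(entities.get("media"))


def keep_tweet(raw_data):
    """Return True if the raw JSON message is a tweet without retweet or link."""
    tweet = json.loads(raw_data)
    if "text" not in tweet:
        # Skip non-tweet messages such as limit notices.
        return False
    return not is_retweet(tweet) and not has_link(tweet)


# Hand-written sample payloads (not real API output).
plain = json.dumps({"text": "bitcoin is up", "entities": {"urls": []}})
linked = json.dumps({"text": "bitcoin news", "entities": {"urls": [{"url": "https://t.co/x"}]}})
retweet = json.dumps({"text": "RT bitcoin", "entities": {"urls": []}, "retweeted_status": {}})

print(keep_tweet(plain), keep_tweet(linked), keep_tweet(retweet))
```

In `on_data` I would then print the tweet only when `keep_tweet(data)` is true. This does not reduce what the stream delivers, though, so it would not help with the rate-limit concern.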
Q1: How can I correctly include the search criteria in my script and get the correct output?
Q2: According to the official documentation, the Streaming API allows up to 400 track keywords (https://developer.twitter.com/en/docs/tweets/filter-realtime/overview/statuses-filter.html). Do my two filter criteria count as 2 of those track keywords?
Thanks in advance,