
We use the Twitter user_timeline API to fetch the last 200 tweets for a set of Twitter accounts. I noticed a couple of weird issues:

  1. A few tweets reached our system hours after their actual creation time. That is, a person tweets; an hour later we call the user_timeline API for that user and don't see the tweet; 8 hours later we call it again and receive the tweet. Does this mean it can sometimes take Twitter hours to index a tweet and make it available to the timeline API?

  2. Sometimes the user's statuses_count decreases from one tweet to the next for a specific account. For example, the first tweet has statuses_count = 100, and the next tweet, posted after the first, has statuses_count = 99. Is this because the user deleted some tweets? Is statuses_count reliable?

Thanks


1 Answer


The Twitter API is eventually consistent, so my theory for the timeline calls is that some data center synchronisation is going on behind the scenes and you are hitting an older copy of the data at the time of the call. It could also be caused by local caching, but it's not clear from the question how you've built your system. In most cases where I've seen an issue like this, that would be my guess as to what is going on. If you want to get Tweets closer to real time, that's what the streaming API is optimized for; the REST API works differently.
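If you keep polling the REST timeline, one way to tolerate this is to deduplicate by Tweet id across polls instead of assuming each poll is complete up to the time it ran. Below is a minimal sketch of that idea, not a definitive implementation. It assumes each Tweet is a dict with the `id` and `created_at` fields in the v1.1 format; `merge_timeline`, `seen`, and `previous_poll_time` are hypothetical names made up for illustration, and no API call is made here.

```python
from datetime import datetime, timezone

# Assumed v1.1 `created_at` format, e.g. "Wed Oct 10 20:19:24 +0000 2018".
TWITTER_TIME_FORMAT = "%a %b %d %H:%M:%S %z %Y"


def merge_timeline(seen, fetched, previous_poll_time=None):
    """Merge one poll's Tweets into `seen`, a dict keyed by Tweet id.

    Returns (new_tweets, late_tweets). A Tweet counts as "late" when it was
    created before the previous poll ran but was not returned by that poll.
    """
    new_tweets = []
    late_tweets = []
    for tweet in fetched:
        tweet_id = tweet["id"]
        if tweet_id in seen:
            continue  # already stored by an earlier poll
        seen[tweet_id] = tweet
        new_tweets.append(tweet)
        created = datetime.strptime(tweet["created_at"], TWITTER_TIME_FORMAT)
        if previous_poll_time is not None and created < previous_poll_time:
            late_tweets.append(tweet)
    return new_tweets, late_tweets


# Hypothetical usage: the second poll returns Tweet 2, which the first missed.
seen = {}
first_poll = [{"id": 1, "created_at": "Wed Oct 10 10:00:00 +0000 2018"}]
merge_timeline(seen, first_poll)

first_poll_time = datetime(2018, 10, 10, 12, 0, tzinfo=timezone.utc)
second_poll = [
    {"id": 3, "created_at": "Wed Oct 10 13:00:00 +0000 2018"},
    {"id": 2, "created_at": "Wed Oct 10 11:00:00 +0000 2018"},
    {"id": 1, "created_at": "Wed Oct 10 10:00:00 +0000 2018"},
]
new_tweets, late_tweets = merge_timeline(seen, second_poll, first_poll_time)
```

The point of the design is that a poll only adds to what you know; a Tweet missing from one response is not treated as proof that it does not exist.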

On the second question, there's again a small chance that this is a consistency issue, or it could indeed be due to Tweet deletion. The different elements of the Tweet object (user object, media info, links, etc.) are hydrated from different systems, so they may just be momentarily out of sync, or Tweets may have been deleted.
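To see how often this happens in your own data, you could log the count attached to each Tweet and flag the places where it goes down. This is only a rough sketch: `find_count_drops` and the `(tweet_id, statuses_count)` pair layout are hypothetical, and a drop found this way cannot tell you whether it came from a deletion or from the user object being briefly out of sync.

```python
def find_count_drops(observations):
    """Return (previous_id, current_id, drop) for each decrease in the count.

    `observations` is a list of (tweet_id, statuses_count) pairs ordered by
    Tweet creation time, oldest first.
    """
    drops = []
    for (prev_id, prev_count), (cur_id, cur_count) in zip(
        observations, observations[1:]
    ):
        if cur_count < prev_count:
            drops.append((prev_id, cur_id, prev_count - cur_count))
    return drops


# Hypothetical data matching the question: 100 on one Tweet, 99 on the next.
drops = find_count_drops([(1, 100), (2, 99), (3, 100)])
```

If the count recovers on a later fetch of the same account without new Tweets, that points toward a sync issue rather than a deletion.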