I want to extract the external URL attached to a Tweet, along with the thumbnail that Twitter generates for that URL.

Example Tweet:

http://prntscr.com/ogdqey

https://twitter.com/JarirBookstore/status/1151506848387870720

Output from the Twitter statuses/user_timeline API:

{
    "created_at": "Wed Jul 17 15:00:01 +0000 2019",
    "id": 1151506848387870720,
    "id_str": "1151506848387870720",
    "full_text": "عروض #صيف_هواوي على أجهزة التابلت والميت بوك المختلفة \nالعرض ساري الى 21 يوليو",
    "truncated": false,
    "display_text_range": [
        0,
        78
    ],
    "entities": {
        "hashtags": [
            {
                "text": "صيف_هواوي",
                "indices": [
                    5,
                    15
                ]
            }
        ],
        "symbols": [],
        "user_mentions": [],
        "urls": []
    },
    "source": "<a href=\"https:\/\/ads-api.twitter.com\" rel=\"nofollow\">Twitter Ads Composer<\/a>",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": 281376243,
        "id_str": "281376243"
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 0,
    "favorite_count": 3,
    "favorited": false,
    "retweeted": false,
    "lang": "ar"
},

The urls entity is an empty array. It seems that if the link is not present in the Tweet's text itself, the API doesn't return it in the urls entity. I've tried with and without tweet_mode=extended.

Surprisingly, Twitter does return the URL for some Tweets. One example is below.

https://twitter.com/BillGates/status/1150605518291001345

API Response:

{
    "created_at": "Mon Jul 15 03:18:27 +0000 2019",
    "id": 1150605518291001345,
    "id_str": "1150605518291001345",
    "full_text": "I recently wrote about how people with tech skills can find fascinating problems to work on in global health and development. I was excited to come across this @techreview article about African machine learning researchers who are already doing just that. https:\/\/t.co\/3e1d2QvvH4",
    "truncated": false,
    "display_text_range": [
        0,
        279
    ],
    "entities": {
        "hashtags": [],
        "symbols": [],
        "user_mentions": [
            {
                "screen_name": "techreview",
                "name": "MIT Technology Review",
                "id": 15808647,
                "id_str": "15808647",
                "indices": [
                    160,
                    171
                ]
            }
        ],
        "urls": [
            {
                "url": "https:\/\/t.co\/3e1d2QvvH4",
                "expanded_url": "https:\/\/b-gat.es\/2xMsbdh",
                "display_url": "b-gat.es\/2xMsbdh",
                "indices": [
                    256,
                    279
                ]
            }
        ]
    },
    "source": "<a href=\"https:\/\/www.sprinklr.com\" rel=\"nofollow\">Sprinklr<\/a>",
    "in_reply_to_status_id": null,
    "in_reply_to_status_id_str": null,
    "in_reply_to_user_id": null,
    "in_reply_to_user_id_str": null,
    "in_reply_to_screen_name": null,
    "user": {
        "id": 50393960,
        "id_str": "50393960"
    },
    "geo": null,
    "coordinates": null,
    "place": null,
    "contributors": null,
    "is_quote_status": false,
    "retweet_count": 1320,
    "favorite_count": 6719,
    "favorited": false,
    "retweeted": false,
    "possibly_sensitive": false,
    "lang": "en"
},

Why is the response inconsistent? It returns the URL for Bill Gates' Tweet but not for the one mentioned earlier in my question.

How can I get both the external URL and the thumbnail displayed by Twitter?

My final API call:

https://api.twitter.com/1.1/statuses/user_timeline.json?screen_name=jarirbookstore&count=100&exclude_replies=true&trim_user=true&include_rts=false&tweet_mode=extended
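
For readability, the same request can be assembled from its individual query parameters. This is only a sketch: the names `BASE_URL`, `params`, and `request_url` are illustrative, and the OAuth authentication the endpoint requires is not shown.

```python
from urllib.parse import urlencode

# Endpoint and parameters copied from the call above.
BASE_URL = "https://api.twitter.com/1.1/statuses/user_timeline.json"

params = {
    "screen_name": "jarirbookstore",
    "count": 100,
    "exclude_replies": "true",
    "trim_user": "true",
    "include_rts": "false",
    "tweet_mode": "extended",  # return full_text instead of truncated text
}

# urlencode keeps the dict's insertion order, so this rebuilds the same URL.
request_url = f"{BASE_URL}?{urlencode(params)}"
print(request_url)
```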

1 Answer

The first of the two examples you provide was posted through the Twitter Ads platform, so the attached card is not part of the Tweet itself, and there is no way to get it via the API. In the second case, the URL is part of the Tweet text, so it also appears in the urls array of the entities object.
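
As a rough illustration of the difference, here is a minimal sketch that reads whatever the API does return in the urls array. The helper name `extract_urls` is hypothetical, and the two dictionaries are trimmed copies of the responses quoted in the question. It cannot recover the card attached by the Ads platform, because that data is simply not in the response.

```python
def extract_urls(tweet):
    """Return the expanded URLs listed in a Tweet's entities.urls array.

    Hypothetical helper: it only reads fields already present in the
    statuses/user_timeline response and makes no extra API calls.
    """
    entities = tweet.get("entities") or {}
    return [
        # Fall back to the t.co link if expanded_url is missing.
        item.get("expanded_url") or item.get("url")
        for item in entities.get("urls", [])
    ]


# Trimmed copy of the Ads Composer Tweet: the urls array is empty.
ads_tweet = {
    "id_str": "1151506848387870720",
    "entities": {"hashtags": [], "symbols": [], "user_mentions": [], "urls": []},
}

# Trimmed copy of the Tweet whose link is part of the text.
text_link_tweet = {
    "id_str": "1150605518291001345",
    "entities": {
        "urls": [
            {
                "url": "https://t.co/3e1d2QvvH4",
                "expanded_url": "https://b-gat.es/2xMsbdh",
                "display_url": "b-gat.es/2xMsbdh",
                "indices": [256, 279],
            }
        ]
    },
}

print(extract_urls(ads_tweet))        # nothing to extract for the Ads Tweet
print(extract_urls(text_link_tweet))  # the link that appears in the Tweet text
```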