0
votes

Hello there, I get the error below when running this script in Python:

import requests

r = requests.get('https://www.instagram.com/p/CJDxE7Yp5Oj/?__a=1')
data = r.json()['graphql']['shortcode_media']

C:\ProgramData\Anaconda3\envs\test\python.exe C:/Users/Solba/PycharmProjects/test/main.py
Traceback (most recent call last):
  File "C:/Users/Solba/PycharmProjects/test/main.py", line 4, in <module>
    data = r.json()
  File "C:\ProgramData\Anaconda3\envs\test\lib\site-packages\requests\models.py", line 900, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\ProgramData\Anaconda3\envs\test\lib\json\__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "C:\ProgramData\Anaconda3\envs\test\lib\json\decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "C:\ProgramData\Anaconda3\envs\test\lib\json\decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Process finished with exit code 1


Python version: 3.9
PyCharm version: 2020.3.1
Anaconda version: 1.10.0


Please help. Thank you.

2 Answers

1
votes

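r.json() expects the response body to be a JSON string. A well-behaved API also says it is responding with JSON through the Content-Type response header, but r.json() does not check that header; it simply tries to parse the body.

In this case, the URL you are requesting is not responding with valid JSON. "Expecting value: line 1 column 1 (char 0)" means the parser failed on the very first character, which usually happens when the body is empty or is something else, such as an HTML page.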

You can first inspect the response body returned by the URL:

data = r.text
print(data)

If the printed text is valid JSON, you can parse it yourself (this is essentially what r.json() does internally, so it only succeeds when the body really is JSON):

import json
data = json.loads(r.text)

Note: You can also check the Content-Type header of the response, and send an Accept header with the request, to make sure both sides agree on the data type.
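As a rough sketch of the check described above, the helper below tries to parse a body and, if that fails, reports where parsing stopped and what the body starts with. The name try_parse_json and its return shape are my own choices for illustration, not part of requests.

```python
import json


def try_parse_json(body, content_type=""):
    """Return (data, problem): parsed JSON and None, or None and a short diagnosis.

    Hypothetical helper for illustration; not part of requests.
    """
    try:
        return json.loads(body), None
    except json.JSONDecodeError as exc:
        # Report where parsing failed and a short preview of what was sent.
        preview = body[:60].replace("\n", " ")
        problem = (
            f"not JSON (content-type: {content_type or 'unknown'}): "
            f"{exc.msg} at char {exc.pos}; body starts with {preview!r}"
        )
        return None, problem


# Usage with requests, where r is the response from requests.get(...):
#   data, problem = try_parse_json(r.text, r.headers.get("content-type", ""))
#   if problem:
#       print(problem)
```

With an HTML body, the diagnosis should make it obvious that the server sent a page rather than JSON.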

0
votes

The reason is that the response is not JSON but a whole HTML page. Try r.text instead of r.json(), and then do whatever you want from there.

If you are not sure what type of content it returns:

import requests

# A HEAD request fetches only the response headers, not the body
h = requests.head('https://www.instagram.com/p/CJDxE7Yp5Oj/?__a=1')
content_type = h.headers.get('content-type')
print(content_type)

For your URL, this prints text/html.
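If you want to make that check in code before calling r.json(), something like the following could work. It is only a sketch: is_json_content_type is a made-up helper name, and it assumes that treating "application/json" and any media type ending in "+json" as JSON is good enough for your case.

```python
def is_json_content_type(value):
    """Return True if a Content-Type header value names a JSON media type.

    Hypothetical helper for illustration; not part of requests.
    """
    if not value:
        return False
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = value.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


# Usage with requests, where r is the response from requests.get(...):
#   if is_json_content_type(r.headers.get("content-type")):
#       data = r.json()
#   else:
#       print(r.text[:200])
```

A matching header is no guarantee that the body parses, so it is still worth catching the decode error around r.json().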

Alternatively, you can try adding a User-Agent header to your request, so that it looks like it comes from a browser and not a script.

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/46.0.2490.80'
}

r = requests.get('https://www.instagram.com/p/CJDxE7Yp5Oj/?__a=1', headers=headers)
data = r.json()