My project is in Python 3.6.2. I'm trying to decide whether images are worth downloading at all (that is, whether they have a certain aspect ratio) by reading only the header (the first ~100 bytes of the online file). So far I'm just testing with imghdr and Pillow.
Image.open fails at the end with:
  File "C:\Program Files (x86)\Python36-32\lib\site-packages\PIL\Image.py", line 2349, in open
    % (filename if filename else fp))
OSError: cannot identify image file <_io.BytesIO object at 0x02F41960>
I found the Pillow 2.8.0 release notes, which seemed to suggest I'd be able to pass the raw response from requests straight to Image.open, i.e. Image.open(response.raw). I guessed I should also be able to reuse the already-downloaded header bytes, after wrapping them in a BytesIO and resetting it with seek(0).
Other answers with this error seem to deal with saving the image buffer to an actual file, which I am trying to avoid (just reusing the downloaded bytes from response.raw for all my test/checks, and not making multiple download requests to any server.)
Where am I going wrong please?
Here is my sample code:
import requests
from PIL import Image
import imghdr
import io

if __name__ == '__main__':
    url = "https://ichef-1.bbci.co.uk/news/660/cpsprodpb/37B5/production/_89716241_thinkstockphotos-523060154.jpg"
    try:
        response = requests.get(url, stream=True)
        if response.status_code == 200:
            response.raw.decode_content = True
            # Grab first 100 bytes as potential image header
            header = response.raw.read(100)
            ext = imghdr.what(None, h=header)
            # ext is None when the bytes are not a recognised image type
            print("Found: " + str(ext))
            if ext is not None:  # Proceed to other tests if we received an image at all
                header = io.BytesIO(header)
                header.seek(0)
                im = Image.open(header)  # this is the call that raises OSError
                im.verify()
                # other image-related tasks here
        else:
            print("Received error " + str(response.status_code))
    except requests.ConnectionError as e:
        print(e)
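
To check whether 100 bytes can even contain the dimensions, here is a rough stdlib-only sketch that walks the JPEG segments looking for a start-of-frame marker. It is not how Pillow does it; `jpeg_size` and `SOF_MARKERS` are names I made up for illustration, and it assumes a well-formed JPEG and ignores every other format. If the frame header lies beyond the bytes I read, it returns None, which may be related to the error above.

```python
import struct

# Start-of-frame markers SOF0..SOF15, minus 0xC4 (DHT), 0xC8 (JPG), 0xCC (DAC)
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def jpeg_size(data):
    """Return (width, height) if a frame header is fully inside data, else None."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        return None
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return None  # not on a marker boundary; this sketch gives up
        marker = data[pos + 1]
        if marker == 0xFF:  # fill byte before a marker, skip it
            pos += 1
            continue
        # Segment length is big-endian and counts its own two bytes
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            if pos + 9 > len(data):
                return None  # frame header is cut off by the truncation
            # Layout after the length: precision (1 byte), height (2), width (2)
            height, width = struct.unpack(">HH", data[pos + 5:pos + 9])
            return width, height
        pos += 2 + length  # jump over this segment to the next marker
    return None
```

Calling `jpeg_size(header)` on the 100 bytes from my code would show whether the frame header is inside them; segments such as EXIF data or quantization tables come before it and can easily push it past that offset.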