32
votes

I am using the Pillow fork of PIL and keep receiving the error

OSError: cannot identify image file <_io.BytesIO object at 0x103a47468>

when trying to open an image. I am using virtualenv with python 3.4 and no installation of PIL.

I have tried to find a solution to this based on others encountering the same problem, however, those solutions did not work for me. Here is my code:

from PIL import Image
import io

# This portion is part of my test code
byteImg = Image.open("some/location/to/a/file/in/my/directories.png").tobytes()

# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO) # <- Error here

The image exists in the initial opening of the file and it gets converted to bytes. This appears to work for almost everyone else but I can't figure out why it fails for me.

EDIT:

dataBytesIO.seek(0)

does not work as a solution (tried it) since I'm not saving the image via a stream, I'm just instantiating the BytesIO with data, therefore (if I'm thinking of this correctly) seek should already be at 0.

4
I'd recommend moving the solution from the post to its own answer. Just for formatting sakeeshimoniak
@FracturedRetina I agree. Please move the answer to a separe post, so that the community can vote on it and so that you can earn reputation, Elan M.Flimm
It happened to me when accidentally loading a PDF instead of PNG.Bohumir Zamecnik

4 Answers

28
votes

(This solution is from the author himself. I have just moved it here.)

SOLUTION:

# This portion is part of my test code
byteImgIO = io.BytesIO()
byteImg = Image.open("some/location/to/a/file/in/my/directories.png")
byteImg.save(byteImgIO, "PNG")
byteImgIO.seek(0)
byteImg = byteImgIO.read()


# Non test code
dataBytesIO = io.BytesIO(byteImg)
Image.open(dataBytesIO)

The problem was with the way that Image.tobytes()was returning the byte object. It appeared to be invalid data and the 'encoding' couldn't be anything other than raw which still appeared to output wrong data since almost every byte appeared in the format \xff\. However, saving the bytes via BytesIO and using the .read() function to read the entire image gave the correct bytes that when needed later could actually be used.

1
votes

While reading Dicom files the problem might be caused due to Dicom compression. Make sure both gdcm and pydicom are installed.

GDCM is usually the one that's more difficult to install. The latest way to easily install the same is

conda install -U conda-forge gdcm
0
votes

On some cases the same error happens when you are dealing with a Raw Image file such CR2. Example: http://www.rawsamples.ch/raws/canon/g10/RAW_CANON_G10.CR2

when you try to run:

byteImg = Image.open("RAW_CANON_G10.CR2")

You will get this error:

OSError: cannot identify image file 'RAW_CANON_G10.CR2'

So you need to convert the image using rawkit first, here is an example how to do it:

from io import BytesIO
from PIL import Image, ImageFile
import numpy
from rawkit import raw
def convert_cr2_to_jpg(raw_image):
    raw_image_process = raw.Raw(raw_image)
    buffered_image = numpy.array(raw_image_process.to_buffer())
    if raw_image_process.metadata.orientation == 0:
        jpg_image_height = raw_image_process.metadata.height
        jpg_image_width = raw_image_process.metadata.width
    else:
        jpg_image_height = raw_image_process.metadata.width
        jpg_image_width = raw_image_process.metadata.height
    jpg_image = Image.frombytes('RGB', (jpg_image_width, jpg_image_height), buffered_image)
    return jpg_image

byteImg = convert_cr2_to_jpg("RAW_CANON_G10.CR2")

Code credit if for mateusz-michalik on GitHub (https://github.com/mateusz-michalik/cr2-to-jpg/blob/master/cr2-to-jpg.py)

0
votes

image = Image.open(io.BytesIO(decoded)) # File "C:\Users\14088\anaconda3\envs\tensorflow\lib\site-packages\PIL\Image.py", line 2968, in open # "cannot identify image file %r" % (filename if filename else fp) # PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x000002B733BB11C8>

=== I fixed as worked: message = request.get_json(force=True)

encoded = message['image']
# https://stackguides.com/questions/26070547/decoding-base64-from-post-to-use-in-pil
#image_data = re.sub('^data:image/.+;base64,', '', message['image'])
image_data = re.sub('^data:image/.+;base64,', '', encoded)
# Remove extra "data:image/...'base64" is Very important
# If "data:image/...'base64" is not remove, the following line generate an error message: 
# File "C:\Work\SVU\950_SVU_DL_TF\sec07_TF_Flask06_09\32_KerasFlask06_VisualD3\32_predict_app.py", line 69, in predict
# image = Image.open(io.BytesIO(decoded))
# File "C:\Users\14088\anaconda3\envs\tensorflow\lib\site-packages\PIL\Image.py", line 2968, in open
# "cannot identify image file %r" % (filename if filename else fp)
# PIL.UnidentifiedImageError: cannot identify image file <_io.BytesIO object at 0x000002B733BB11C8>
# image = Image.open(BytesIO(base64.b64decode(image_data)))
decoded = base64.b64decode(image_data)
image = Image.open(io.BytesIO(decoded))
# return json.dumps({'result': 'success'}), 200, {'ContentType': 'application/json'}
#print('@app.route => image:')
#print()
processed_image = preprocess_image(image, target_size=(224, 224))

prediction = model.predict(processed_image).tolist()
#print('prediction:', prediction)
response = {
    'prediction': {
        'dog': prediction[0][0],
        'cat': prediction[0][1]
    }
}
print('response:', response)
return jsonify(response)