4
votes

I am using Google Colaboratory with Google Drive mounted. When I access a CSV file, I get the following error:

OSError: [Errno 5] Input/output error.

This did not happen before.

How can I access the CSV file as I used to?

I have tried the suggestions in this question, but they did not work:

Input/output error while using google colab with google drive

The problem started after I ran the following code.

for segment_id in tqdm(range(segment_num)):
  with h5py.File(os.path.join(INPUT_PATH, "train.h5"), "r") as f:
    train_answers.append(f['time_to_failure'][segment_id*segment_interval + SEGMENT_LENGTH])

The tqdm bar progressed to 37% and then the code failed with the following error.

OSError: Unable to open file (file read failed: time = Thu May 2 14:14:09 2019 , filename = './drive/My Drive/Kaggle/LANL-Earthquake-Prediction/input/train.h5', file descriptor = 74, errno = 5, error message = 'Input/output error', buf = 0x7ffc31926d00, total read size = 8, bytes this sub-read = 8, bytes actually read = 18446744073709551615, offset = 0)

Since then, large files on Google Drive, such as train.csv (9 GB), cannot be read from Google Colaboratory. Reading them gives the following error.

OSError: [Errno 5] Input/output error

Does anyone have the same problem?

Does anyone know how to solve this?

2
Was this ever solved? I have the same issue. Yesterday reading files worked fine; today it failed. I even tried buying Colab Pro, and it made no difference. - Charles Curt
@CharlesCurt Try working with a temporary copy of your files on the Colab VM: unzip the archive with the !unzip command (see my hint below). I also sent Google some money, but that was not the right solution :) - Lukas

2 Answers

0
votes

Google sets quotas that are not necessarily shown while you are using Colab. I have run into the same problem. Basically, once the limit is exceeded you get the [Errno 5] Input/output error regardless of the file or the operation involved.

The problem seems to have been solved since I asked for an increase of the storage quota (limited to 1 TB total per week). You can reach the quota page by visiting this page and clicking on quota: https://cloud.google.com/docs/quota

If you don't ask for a quota increase, you might have to wait 7-14 days until your usage is reset to 0 and you can use the full quota again.

I hope this helps!

0
votes

I've encountered the same error (during overly intensive testing of transfer learning). According to Google, the cause may be too many I/O operations on small files, or shared resources that are being used more heavily; in every case the cause is related to Google Drive usage. The quota is usually refreshed after about 1 day.
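For the loop in the question, one way to reduce the number of I/O operations against Drive is to open the HDF5 file once instead of once per segment. The following is only a minimal sketch under that assumption: the function names `segment_end_indices` and `read_targets` are made up for illustration, the dataset name `time_to_failure` is taken from the question, and there is no guarantee that this alone keeps you under the quota.

```python
def segment_end_indices(segment_num, segment_interval, segment_length):
    """Return the dataset indices that the question's loop reads, computed up front."""
    return [i * segment_interval + segment_length for i in range(segment_num)]


def read_targets(h5_path, segment_num, segment_interval, segment_length):
    """Open the HDF5 file a single time and read every target value from it."""
    import h5py  # imported here so the helper above works without h5py installed

    indices = segment_end_indices(segment_num, segment_interval, segment_length)
    with h5py.File(h5_path, "r") as f:
        dataset = f["time_to_failure"]
        # One open/close for the whole loop instead of one per segment.
        return [dataset[i] for i in indices]
```

With the question's variables this would be called as `train_answers = read_targets(os.path.join(INPUT_PATH, "train.h5"), segment_num, segment_interval, SEGMENT_LENGTH)`.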

You can also try another solution (for impatient users like me): upload your resources to Google Drive as a single zip file (in my case a zipped folder named data, containing the folders train and validation with images) and then unzip it directly onto the Colab VM with:

!unzip -qq '/content/grive/My Drive/CNN/Datafiles/data.zip'  

You can then access the data from the folder /content/data/... (and say goodbye to the I/O error ;) )
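The same idea can be written in plain Python instead of a shell command. This is a rough sketch, not the exact steps used above: `extract_to_local` is a hypothetical helper name, and it assumes Drive is already mounted and that the VM has enough local disk space for the archive and its contents.

```python
import shutil
import zipfile
from pathlib import Path


def extract_to_local(zip_on_drive, local_dir):
    """Copy a zip archive off the mounted Drive once, then unpack it on local disk."""
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)

    # One large sequential read from Drive instead of many small ones.
    local_zip = local_dir / Path(zip_on_drive).name
    shutil.copy(zip_on_drive, local_zip)

    # From here on, every read hits the VM's own disk, not the Drive mount.
    with zipfile.ZipFile(local_zip) as archive:
        archive.extractall(local_dir)

    local_zip.unlink()  # the local copy of the archive is no longer needed
    return local_dir
```

For the archive above, the call would look like `extract_to_local('/content/grive/My Drive/CNN/Datafiles/data.zip', '/content')`, after which training code reads from the extracted folder rather than from Drive.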