13
votes

When I use Google Colab, I get this error intermittently and seemingly at random; sometimes it works and sometimes it doesn't.

OSError: [Errno 5] Input/output error

Does this error occur when I interface with Google Drive? Are there any solutions for this bug?

4

4 Answers

9
votes

From the FAQ --

Google Drive operations can time out when the number of files or subfolders in a folder grows too large. If thousands of items are directly contained in the top-level "My Drive" folder then mounting the drive will likely time out. Repeated attempts may eventually succeed as failed attempts cache partial state locally before timing out. If you encounter this problem, try moving files and folders directly contained in "My Drive" into sub-folders. A similar problem can occur when reading from other folders after a successful drive.mount(). Accessing items in any folder containing many items can cause errors like OSError: [Errno 5] Input/output error (python 3) or IOError: [Errno 5] Input/output error (python 2). Again, you can fix this problem by moving directly contained items into sub-folders.
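The FAQ's suggestion of moving directly contained items into sub-folders can be scripted. A minimal sketch, assuming the drive is already mounted and `src` points at the crowded folder; `shard_folder`, the `shard_*` names, and the batch size are my own illustration, not part of the FAQ:

```python
import os
import shutil

def shard_folder(src, batch_size=500):
    """Move files directly inside `src` into numbered sub-folders
    (shard_000, shard_001, ...) so that no single folder directly
    contains too many items."""
    files = [f for f in sorted(os.listdir(src))
             if os.path.isfile(os.path.join(src, f))]
    for i, name in enumerate(files):
        shard = os.path.join(src, f"shard_{i // batch_size:03d}")
        os.makedirs(shard, exist_ok=True)
        shutil.move(os.path.join(src, name), os.path.join(shard, name))

# e.g. shard_folder("/content/drive/MyDrive")  # hypothetical mount path
```

Note that if listing the folder itself already times out, even this script may fail; in that case the sharding is easier to do from the Drive web UI.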

2
votes

I ran into this error while using os.listdir on a Google Drive folder that had over 5.5k files in it, and a little window in the bottom-left corner of my Colab notebook popped up saying a timeout had occurred.

Because I have Colab Pro, I tried switching my runtime's Hardware Accelerator to GPU and its Runtime Shape to High-RAM. This fixed the problem for me; I'm not sure whether it was one of those options or both together.

The problem with the top answer is that you might need some basic functionality in Colab (like os.listdir) just to move files and create sub-folders efficiently in the first place. If you can't even list a folder's contents without a timeout, you may need to upgrade to Colab Pro to gain those more powerful runtime options.
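When a full os.listdir call on a huge folder times out, one workaround to try (an assumption on my part, not a documented fix) is to iterate the directory lazily with os.scandir and stop after a handful of entries, so you never materialize the entire listing at once. A quick sketch; `peek_folder` is my own helper name:

```python
import itertools
import os

def peek_folder(path, limit=20):
    """Lazily read at most `limit` directory entries instead of
    materializing the whole listing with os.listdir. This *may*
    reduce pressure on the Drive FUSE backend for huge folders."""
    with os.scandir(path) as entries:
        return [e.name for e in itertools.islice(entries, limit)]
```

This at least lets you confirm the folder is reachable before attempting a full listing or a bulk move.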

0
votes

Another possible solution is to save your files in a different (new) directory. I think @bob-smith's solution is one of the best for this problem; I am just showing a variation of it that worked for me.

0
votes

I face it fairly regularly, along with a dialog:

A Google Drive timeout has occurred (most recently at 12:46:20 PM). More info.

Sometimes running a code cell three times clears the error; sometimes I have to run the cell as many as 8-9 times before it executes successfully.

As expected, the problem always happens during data loading. My data-loading cell usually defines splitting, item transforms, and batch transforms, so re-running it multiple times adds a real time cost.

Instead of re-running the data-loading cell, I run an ls command with Bash's ! syntax in a separate cell. I look for a file with a known name in the training directory by piping ls into grep, like so:

! ls /content/path/to/training/dr/ | grep xyz_001 # I *know* xyz_001 exists in a filename

If this cell eventually executes successfully after n tries and shows the desired filename in the output, the data-loading cell then runs successfully every time, and you can begin training.

It is important to note that I don't run ls over the entire training directory without a grep, because that will always fail: my training directory sometimes contains about 100k files.

This is an ugly hack but it works every time.
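The same retry-until-visible probe can be done from Python instead of shelling out, which makes the "n tries" part automatic. A sketch under the same idea; `warm_drive_cache` and the probe path are my own names, and the claim that a visible probe file means the folder is warmed up is the answer's observation, not a guarantee:

```python
import os
import time

def warm_drive_cache(probe_path, retries=10, delay=5):
    """Repeatedly check that a known file is visible before running
    the heavy data-loading cell. Returns True once the probe file is
    seen, False after all retries fail."""
    for _ in range(retries):
        try:
            if os.path.exists(probe_path):
                return True
        except OSError:
            # Errno 5 can surface here during a Drive timeout
            pass
        time.sleep(delay)
    return False

# e.g. warm_drive_cache("/content/path/to/training/dr/xyz_001.jpg")  # hypothetical
```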