
I have a URL that is updated with a new zip file every day. The zip file contains a folder, and inside that folder is the xls file I want to read into pandas.

I tried using the zipfile module:

zf = zipfile.ZipFile('http://xxxxx/xxxx/xxxxx/xxxxx.zip')

But it gave an error:

IOError: [Errno 22] invalid mode ('rb') or filename: 'http://xxxxx/xxxx/xxxxx/xxxxx.zip'

Also, only read_csv seems to have a compression parameter.

How do I achieve this?


1 Answer


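zipfile.ZipFile expects a local path or a file-like object, not a URL, which is why it raised the IOError. You can download the archive with urllib and wrap the bytes with io:

import io
import zipfile
from urllib.request import urlopen
# from urllib import urlopen  # for python 2

url = 'http://xxxxx/xxxx/xxxxx/xxxxx.zip'  # the link from the question

# Download the archive into memory and open it as a file-like object
zf = zipfile.ZipFile(io.BytesIO(urlopen(url).read()))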
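
To get from the opened archive to the xls file inside the folder, something along these lines might work. This is only a sketch: extract_member and download are hypothetical helper names, and it assumes the archive holds exactly one member whose name ends in .xls.

```python
import io
import zipfile
from urllib.request import urlopen


def download(url):
    """Fetch the raw bytes behind a URL (hypothetical helper)."""
    with urlopen(url) as resp:
        return resp.read()


def extract_member(zip_bytes, suffix=".xls"):
    """Return (name, file-like object) for the first member ending in suffix.

    Hypothetical helper; assumes zip_bytes is the content of a zip archive.
    Member names include the folder, e.g. 'folder/report.xls'.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for name in zf.namelist():
            if name.lower().endswith(suffix):
                # Copy the member into memory so it outlives the archive
                return name, io.BytesIO(zf.read(name))
    raise FileNotFoundError("no member ending in %r found" % suffix)


# Usage (not run here: it needs network access, pandas, and an
# Excel engine that can read .xls files, such as xlrd):
#
# import pandas as pd
# name, xls = extract_member(download(url))
# df = pd.read_excel(xls)
```

pd.read_excel accepts a file-like object, so the extracted member never has to be written to disk.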

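As another option, pd.read_csv accepts a compression argument (for example compression='zip' or compression='gzip'), but that only helps when the archive contains a single CSV file, so it does not cover an xls file inside a zip.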