Converting a supposed excel file in csv in python

Question

I am having an issue trying to use a code for converting a file into csv. I am using the code below as a start

directory = 'C:\OI Data'

filename = 'OpenInterest08-24-16'

data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1',    index_col=None)
    data_xls.to_csv(os.path.join(directory,filename +'.csv'), encoding='utf-8')

and I am getting the following error:

Traceback (most recent call last):

File "", line 1, in

File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 714, in runfile execfile(filename, namespace)

File "C:\Anaconda2\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 74, in execfile exec(compile(scripttext, filename, 'exec'), glob, loc)

File "C:/Users/Public/Documents/Python Scripts/work.py", line 26, in data_xls = pd.read_excel(os.path.join(directory,filename), 'Sheet1', index_col=None)

File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 170, in read_excel io = ExcelFile(io, engine=engine)

File "C:\Anaconda2\lib\site-packages\pandas\io\excel.py", line 227, in init self.book = xlrd.open_workbook(io)

File "C:\Anaconda2\lib\site-packages\xlrd__init__.py", line 441, in open_workbook ragged_rows=ragged_rows,

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 91, in open_workbook_xls biff_version = bk.getbof(XL_WORKBOOK_GLOBALS)

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1230, in getbof bof_error('Expected BOF record; found %r' % self.mem[savpos:savpos+8])

File "C:\Anaconda2\lib\site-packages\xlrd\book.py", line 1224, in bof_error raise XLRDError('Unsupported format, or corrupt file: ' + msg) xlrd.biffh.XLRDError: Unsupported format, or corrupt file: Expected BOF record; found '\n\n\n\n\n '

I am struggling to figure out the file format I am using https://www.theice.com/marketdata/reports/icefuturesus/PreliminaryOpenInterest.shtml?futuresExcel=&tradeDate=8%2F24%2F16

opening the file myself I get the following enter image description here

I am still a beginner at python and some help would be much appreciated.

Thanks

Is your file an xls or xlsx file? pandas will choose the appropriate parser depending on the extension and there doesn't appear to be one... — Jon Clements♦
This is problematic: data_xls.to_csv(os.path.join(directory,filename,'.csv'), encoding='utf-8')... — mechanical_meat
Try instead: data_xls.to_csv(os.path.join(directory,filename+'.csv'), encoding='utf-8') — mechanical_meat

mechanical_meat mechanical_meat · Accepted Answer · 2016-09-06T11:41:01

You can start by fixing this part:

data_xls.to_csv(os.path.join(directory,filename,'.csv'), encoding='utf-8')

What happens when you do that is:

'C:\OI Data\\OpenInterest08-24-16\\.csv'

Which is not what you want. Instead do:

os.path.join(directory,filename+'.csv')

Which will give you:

'C:\OI Data\\OpenInterest08-24-16.csv'

Also, this is not a problem here, but in general be careful with this because a single backslash and a character can indicate an escape sequence, e.g. \n is a newline:

directory = 'C:\OI Data'

Instead escape the backslash like so:

directory = 'C:\\OI Data'

Converting a supposed excel file in csv in python

1 Answers