2
votes

I'm trying to import huge CSV files into a pandas DataFrame (200 columns and millions of lines).

I'm using the read_csv method, to which I pass a dtype dictionary as a parameter in order to speed up the import.

I get some exceptions about the wrong format I pass through dtype, like this:

ValueError: invalid literal for long() with base 10: ''

But there's no reference to the line number or to the column. My files are huge, so this information would save me a lot of time finding what's wrong in my dtype structure.

Any ideas?

Edit:

To be more precise, here's the whole story. First I tried to read my CSV file with this command:

t = pd.read_csv(filename, sep=",")

It gave me this error message:

C:\Python27\lib\site-packages\pandas\io\parsers.py:1159: DtypeWarning: Columns (0) have mixed types. Specify dtype option on import or set low_memory=False.

So I tried to specify my dtype this way (I'm not copy/pasting the full dtype because there are 207 columns):

dtype_file = {
  'a': pd.np.int16,
  'b': pd.np.int16,
...
}
pd.read_csv(filename, sep=",", dtype=dtype_file, na_filter=False)
What parameters are you using with read_csv? – Brian from QuantRocket

2 Answers

2
votes

In fact, I resolved it myself using the low_memory parameter:

pd.read_csv(filename, sep=",", na_filter=False, low_memory=False)
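As a side note, if the underlying problem is empty fields in integer columns, newer pandas versions also offer nullable integer dtypes (note the capitalized "Int16"), which accept missing values where NumPy's int16 would fail. A minimal sketch with made-up inline data:

```python
import io

import pandas as pd

# Hypothetical two-column CSV with an empty field in column 'a'.
csv_data = "a,b\n1,2\n,4\n"

# The nullable "Int16" dtype tolerates the empty field (it becomes <NA>),
# where a plain numpy int16 dtype would raise on import.
df = pd.read_csv(io.StringIO(csv_data), dtype={"a": "Int16", "b": "Int16"})
print(df["a"].isna().tolist())  # prints [False, True]
```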
1
votes

You would get that error if you try to coerce an empty string to long:

In [366]: long("")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-366-65e3f7aa7bfe> in <module>()
----> 1 long("")

ValueError: invalid literal for long() with base 10: ''

So perhaps you have some empty strings in your numeric columns, which are causing the dtype coercion to fail.
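One way to locate the offending cells is to read everything as strings first, then probe each column with pd.to_numeric, which turns unparseable entries into NaN. A sketch using a hypothetical inline CSV (the sample data and column names are made up):

```python
import io

import pandas as pd

# Hypothetical CSV with an empty string where an integer is expected
# (row 1, column 'a').
csv_data = "a,b\n1,2\n,4\n5,6\n"

# Read everything as strings, keeping empty strings intact (na_filter=False).
df = pd.read_csv(io.StringIO(csv_data), dtype=str, na_filter=False)

for col in df.columns:
    # Unparseable cells (like '') become NaN under errors="coerce".
    converted = pd.to_numeric(df[col], errors="coerce")
    bad_rows = df.index[converted.isna()]
    if len(bad_rows):
        print("column {!r}: unparseable values at rows {}".format(col, list(bad_rows)))
```

This reports the exact column and row indices to check, which is the information missing from the original ValueError.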