pandas read_csv raises an exception when it encounters a line with too many fields (the behaviour controlled by error_bad_lines). However, this does not happen when the names argument is specified.
For example, take a CSV file with the following contents:
1, 2, 3
1, 2, 3
1, 2, 3, 4
Reading it with pd.read_csv(filepath, header=None) correctly raises ParserError: Error tokenizing data. C error: Expected 3 fields in line 3, saw 4, because of the additional field on the third line.
However, when 'names' is specified as an argument:
>>> pd.read_csv(filepath, names=['A', 'B', 'C'], header=None)
   A  B  C
0  1  2  3
1  1  2  3
2  1  2  3
No error is raised, and the too-long (bad) line, which should be skipped, is included in the result, truncated to three fields.
Is there a way to specify names and still have the ParserError raised, so that the too-long (bad) lines can be dropped with error_bad_lines=False?
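
For reference, one possible workaround that does not depend on how read_csv treats names is to drop the bad rows before pandas sees them. The following is only a minimal sketch under that assumption, not a built-in pandas feature: the helper keep_rows_with_n_fields is a hypothetical name introduced here, and it uses only the standard csv module to keep rows whose field count matches the number of names.

```python
import csv
import io


def keep_rows_with_n_fields(lines, n_fields):
    """Return a text buffer holding only the rows with exactly n_fields fields.

    Hypothetical helper for illustration; not part of pandas.
    """
    kept = io.StringIO()
    writer = csv.writer(kept, lineterminator="\n")
    # skipinitialspace=True ignores the blank after each comma ("1, 2, 3")
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) == n_fields:
            writer.writerow(row)
    kept.seek(0)  # rewind so a later reader starts at the first kept row
    return kept


# Same contents as the example file above, held in memory for the demo
sample = io.StringIO("1, 2, 3\n1, 2, 3\n1, 2, 3, 4\n")
filtered = keep_rows_with_n_fields(sample, n_fields=3)
print(filtered.getvalue())
```

The returned buffer could then be passed to pd.read_csv(filtered, names=['A', 'B', 'C'], header=None) in place of filepath. For a real file, the sample buffer would be replaced by the file object from open(filepath, newline='').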