The following sequence of commands works (I lose the first line of the data because header=None is not passed, but at least it loads):
df = pd.read_csv(filename,
usecols=range(0, 42))
df.columns = ['YR', 'MO', 'DAY', 'HR', 'MIN', 'SEC', 'HUND',
'ERROR', 'RECTYPE', 'LANE', 'SPEED', 'CLASS',
'LENGTH', 'GVW', 'ESAL', 'W1', 'S1', 'W2', 'S2',
'W3', 'S3', 'W4', 'S4', 'W5', 'S5', 'W6', 'S6',
'W7', 'S7', 'W8', 'S8', 'W9', 'S9', 'W10', 'S10',
'W11', 'S11', 'W12', 'S12', 'W13', 'S13', 'W14']
The following does NOT work:
df = pd.read_csv(filename,
names=['YR', 'MO', 'DAY', 'HR', 'MIN', 'SEC', 'HUND',
'ERROR', 'RECTYPE', 'LANE', 'SPEED', 'CLASS',
'LENGTH', 'GVW', 'ESAL', 'W1', 'S1', 'W2', 'S2',
'W3', 'S3', 'W4', 'S4', 'W5', 'S5', 'W6', 'S6',
'W7', 'S7', 'W8', 'S8', 'W9', 'S9', 'W10', 'S10',
'W11', 'S11', 'W12', 'S12', 'W13', 'S13', 'W14'],
usecols=range(0, 42))
CParserError: Error tokenizing data. C error: Expected 53 fields in line 1605634, saw 54
The following does NOT work either:
df = pd.read_csv(filename,
header=None)
CParserError: Error tokenizing data. C error: Expected 53 fields in line 1605634, saw 54
Hence, for your problem you have to pass usecols=range(0, 2).
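Before settling on a usecols range, it can help to see how many fields each line really has. The sketch below uses only the standard csv module and does not touch pandas. The function name field_count_report and its parameters are made up for illustration, and it assumes a comma-separated file that fits the csv module's default quoting rules.

```python
import csv
from collections import Counter


def field_count_report(path, delimiter=",", max_examples=3):
    """Tally the number of fields per row and remember a few line numbers per width.

    Hypothetical helper for illustration; not part of pandas.
    """
    counts = Counter()   # width -> number of rows with that width
    examples = {}        # width -> first few line numbers with that width
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for row in reader:
            width = len(row)
            counts[width] += 1
            seen = examples.setdefault(width, [])
            if len(seen) < max_examples:
                # reader.line_num is the physical line the row ended on
                seen.append(reader.line_num)
    return counts, examples


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        counts, examples = field_count_report(sys.argv[1])
        for width, n_rows in counts.most_common():
            print(width, "fields:", n_rows, "rows, e.g. lines", examples[width])
```

If almost every row has the same width and only a handful have one more, the extra field is probably a stray delimiter on those lines, and the smallest common width is a reasonable upper bound for usecols.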
If the file was written by pandas.to_csv(), the error MIGHT be because there is a '\r' in a column name, in which case to_csv() will actually write the subsequent column names into the first column of the data frame, causing a difference between the number of columns in the first X rows. This difference is one cause of the C error. – user0
The separator may also be the cause; try pd.read_csv("<path>", sep=";"). Do not use Excel for checking, as it sometimes puts the data into columns by default and therefore removes the separator. – Julian
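Both comments come down to looking at the raw bytes of the file rather than a spreadsheet view. A minimal sketch of such a check follows. The name inspect_header and the candidate separator list are assumptions made for illustration, and the file is assumed to be UTF-8 encoded.

```python
def inspect_header(path, candidates=(",", ";", "\t", "|")):
    """Return the raw first line, separator counts, and a stray-'\\r' flag.

    Hypothetical helper for illustration; not part of pandas.
    """
    # Binary mode splits on '\n' only, so a '\r' inside the header survives.
    with open(path, "rb") as handle:
        first_line = handle.readline()
    text = first_line.decode("utf-8", errors="replace")
    # How often each candidate separator appears in the header line.
    counts = {sep: text.count(sep) for sep in candidates}
    # A '\r' anywhere except the line ending suggests a damaged column name.
    has_stray_cr = "\r" in text.rstrip("\r\n")
    return text, counts, has_stray_cr


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        text, counts, has_stray_cr = inspect_header(sys.argv[1])
        print(repr(text))   # repr() makes '\r' and '\t' visible
        print(counts)
        print("stray carriage return in header:", has_stray_cr)
```

The separator with the highest count is the likely value for sep, and a stray carriage return in the header points at the column-name problem described in the first comment.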