I am parsing an Apache log file and loading it into a pandas DataFrame for further investigation.
The log file contains some malformed lines, so the following error occurs:
ValueError: Expected 11 fields in line 4320, saw 27
To work around this, I passed error_bad_lines=False when reading the file. That doesn't help, because I then get the following error:
ValueError: The 'error_bad_lines' option is not supported with the 'python' engine
Note: I am explicitly using the python engine because my separator is a regular expression.
Code snippet:
data = pd.read_csv(
    log_file,
    sep=r'\s(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])',
    engine='python',
    na_values='-',
    header=None,
    usecols=use_cols,
    skiprows=1,
    converters={
        time_taken_index[0]: parse_sec,
        time_index[0]: parse_datetime,
        req_index[0]: parse_str,
        status_index[0]: parse_str,
    },
    error_bad_lines=False
)
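One possible workaround is to filter out the bad lines before pandas sees them: split each line with the same regular expression and keep only the rows that have the expected number of fields. Below is a minimal sketch of that idea, not a tested solution for my real log. The helper name split_good_lines, the sample lines, and the field count of 7 are made up for illustration (the real file apparently expects 11 fields).

```python
import re

# Same separator as in the read_csv call: whitespace that is not inside
# double quotes and not inside square brackets.
SEP = re.compile(r'\s(?=(?:[^"]*"[^"]*")*[^"]*$)(?![^\[]*\])')


def split_good_lines(lines, expected_fields):
    """Split each line with SEP and separate well-formed rows from bad ones.

    Hypothetical helper: returns (good_rows, bad_line_numbers), where
    bad_line_numbers are 1-based positions within `lines`.
    """
    good_rows = []
    bad_line_numbers = []
    for line_number, line in enumerate(lines, start=1):
        fields = SEP.split(line.rstrip("\r\n"))
        if len(fields) == expected_fields:
            good_rows.append(fields)
        else:
            bad_line_numbers.append(line_number)
    return good_rows, bad_line_numbers


# Made-up sample lines; the second one has an unbalanced quote.
sample = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326\n',
    '127.0.0.1 - - [10/Oct/2000:13:55:37 -0700] "GET /broken 200 2326\n',
]
rows, bad = split_good_lines(sample, expected_fields=7)
print(len(rows), "good rows; bad line numbers:", bad)
```

The surviving rows could then be passed to pd.DataFrame(rows), but na_values, usecols and the converters from the read_csv call would have to be applied separately afterwards.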
I'd be grateful for any suggestions. Thank you.