Pandas won't recognize date while reading csv

Question

I'm working on a script which reads in a .csv file with pandas and fills in a specific form. One column in the .csv file is a birthday-column.

While reading the .csv I parse that column with 'parse_dates' to get a datetime object, so I can format it for my needs:

df = pd.read_csv('readfile1.csv',sep=';', parse_dates=['birthday'])

While it works perfectly with readfile1.csv, it won't work with readfile2.csv. But these files look exactly the same.

The error I get makes me think that the automatic datetime parsing in pandas is not working:

print(df.at[i,'birthday'].strftime("%d%m%Y"))
AttributeError: 'str' object has no attribute 'strftime'

In both cases the format of the birthday looks like:

'1965-05-16T12:00:00.000Z' #from readfile1.csv
'1934-04-06T11:00:00.000Z' #from readfile2.csv

I can't figure out what's wrong. I checked the encoding of the files and both are 'UTF-8'. Any ideas?

Thank you! Greetings

Dtype is 'object' for readfile2. readfile1: datetime64[ns, UTC] — Tomahawk44
If you do not set the keyword parse_dates and instead convert the column after reading the csv, using pd.to_datetime with the keyword errors='coerce', what result do you get? Does the column have NaT values? — MrFuppes
I just did exactly that. The problem was a faulty date, 1077-11-19T12:00:00.000Z, which caused:
File "pandas\_libs\tslibs\np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1077-11-19 12:00:00
— Tomahawk44
great :) If you have more than one faulty timestamp, the method I described can be helpful as well since you can easily find all cells (string col where the datetime col is NaT). — MrFuppes
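The approach from the comments above can be sketched roughly as follows. The inline CSV text and the 'name' column are made up as a stand-in for readfile2.csv, and the sketch assumes a pandas version that parses to nanosecond resolution (where a year like 1077 is out of bounds); on versions that parse strings to a coarser resolution, the 1077 date may simply parse and not show up as NaT.

```python
import io

import pandas as pd

# Hypothetical stand-in for readfile2.csv: two valid dates and the faulty one.
csv_text = (
    "name;birthday\n"
    "a;1965-05-16T12:00:00.000Z\n"
    "b;1934-04-06T11:00:00.000Z\n"
    "c;1077-11-19T12:00:00.000Z\n"
)

# Read the column as plain strings (no parse_dates).
df = pd.read_csv(io.StringIO(csv_text), sep=";")

# Convert afterwards; values that cannot be converted become NaT instead of raising.
parsed = pd.to_datetime(df["birthday"], errors="coerce", utc=True)

# Rows where the original string is present but the parsed value is NaT
# are the ones that need a closer look.
bad_rows = df[parsed.isna() & df["birthday"].notna()]
print(bad_rows)
```

Once the offending rows are fixed or dropped, the parsed column can be assigned back with df["birthday"] = parsed.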

Tomahawk44 · Accepted Answer · 2021-02-05T10:53:51

If you do not set the keyword parse_dates and instead convert the column after reading the csv, using pd.to_datetime with the keyword errors='coerce', what result do you get? Does the column have NaT values? – MrFuppes

MrFuppes' comment about calling pd.to_datetime led to success. One faulty date in the column (1077-11-19, which is out of bounds for a nanosecond timestamp) was the cause of the error. Lumber Jacks's hint was also helpful for determining the data types!
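For the formatting step from the question, a minimal sketch (not necessarily what the original script does) is to format the whole column through the .dt accessor after the conversion, instead of calling strftime on single cells. The sample values are the ones quoted in this thread; how the 1077 date is treated depends on the pandas version, so only the two valid dates should be relied on here.

```python
import pandas as pd

# Sample values quoted in the question and comments.
raw = pd.Series(
    [
        "1965-05-16T12:00:00.000Z",
        "1934-04-06T11:00:00.000Z",
        "1077-11-19T12:00:00.000Z",  # the faulty date
    ],
    name="birthday",
)

# Coerce instead of raising, so one bad value does not leave the column as strings.
parsed = pd.to_datetime(raw, errors="coerce", utc=True)

# Column-wise formatting; entries that became NaT come out as missing values.
formatted = parsed.dt.strftime("%d%m%Y")
print(formatted)
```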
