
I'm working on a script that reads a .csv file with pandas and fills in a specific form. One column in the .csv file is a birthday column.

While reading the .csv, I parse that column with 'parse_dates' to get a datetime object so I can format it for my needs:

import pandas as pd
df = pd.read_csv('readfile1.csv', sep=';', parse_dates=['birthday'])

While it works perfectly with readfile1.csv, it won't work with readfile2.csv. But these files look exactly the same.

The error I get makes me think that pandas' automatic parsing to datetime is not working:

print(df.at[i,'birthday'].strftime("%d%m%Y"))
AttributeError: 'str' object has no attribute 'strftime'

In both cases the format of the birthday looks like:

'1965-05-16T12:00:00.000Z' #from readfile1.csv
'1934-04-06T11:00:00.000Z' #from readfile2.csv

I can't figure out what's wrong. I checked the encoding of the files and both are 'UTF-8'. Any ideas?

Thank you! Greetings

With df.info(), what dtype do you get for the birthday column? – Lumber Jack
The dtype is 'object' for readfile2. For readfile1 it is datetime64[ns, UTC]. – Tomahawk44
If you do not set the parse_dates keyword and instead convert the column after reading the csv, using pd.to_datetime with the keyword errors='coerce', what result do you get? Does the column have NaT values? – MrFuppes
I just did exactly that. The problem was a faulty date, 1077-11-19T12:00:00.000Z, which caused:
File "pandas\_libs\tslibs\np_datetime.pyx", line 113, in pandas._libs.tslibs.np_datetime.check_dts_bounds
pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime: Out of bounds nanosecond timestamp: 1077-11-19 12:00:00
– Tomahawk44
Great :) If you have more than one faulty timestamp, the method I described can be helpful as well, since you can easily find all affected cells (rows where the string column has a value but the datetime column is NaT). – MrFuppes

1 Answer


"If you do not set the parse_dates keyword and instead convert the column after reading the csv, using pd.to_datetime with the keyword errors='coerce', what result do you get? Does the column have NaT values?" – MrFuppes

MrFuppes's comment about calling pd.to_datetime led to success. A single faulty date in the column was the cause: pandas could not convert it, so the whole column stayed as strings (dtype 'object'). Lumber Jack's hint was also helpful for checking the dtypes!
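
For anyone hitting the same thing, here is a minimal sketch of the approach from the comments. It assumes a semicolon-separated file with a 'birthday' column as in the question; the inline sample data and the 'name' column are made up for illustration, and the names parsed, bad_rows and formatted are just placeholders. On pandas versions that store timestamps with nanosecond resolution, the year-1077 value should come back as NaT rather than raising:

```python
import io

import pandas as pd

# Hypothetical sample data standing in for readfile2.csv;
# the 'name' column is invented for this example.
csv_text = (
    "name;birthday\n"
    "a;1965-05-16T12:00:00.000Z\n"
    "b;1077-11-19T12:00:00.000Z\n"
    "c;1934-04-06T11:00:00.000Z\n"
)

# Read WITHOUT parse_dates so the column stays as plain strings.
df = pd.read_csv(io.StringIO(csv_text), sep=";")

# Convert afterwards; values pandas cannot represent become NaT
# instead of silently leaving the whole column as 'object'.
parsed = pd.to_datetime(df["birthday"], errors="coerce", utc=True)

# Rows that had a string but did not convert are the faulty dates.
bad_rows = df[parsed.isna() & df["birthday"].notna()]
print(bad_rows)

# Keep the converted column and format it as in the question.
df["birthday"] = parsed
formatted = df["birthday"].dt.strftime("%d%m%Y")
print(formatted)
```

Printing bad_rows shows the original strings next to their row index, which makes it easy to fix them in the source file or handle them separately before filling in the form.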