You can convert the df to str using astype and then apply to_datetime with a format string:
In [190]:
df.astype(str).apply(lambda x: pd.to_datetime(x, format='%Y%m%d'))
Out[190]:
        col1       col2
0 2004-09-29        NaT
1        NaT 2004-09-25
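As a self-contained illustration of the same idea, here is a minimal sketch. The sample frame and the helper name `floats_to_dates` are assumptions made for the example, not part of the original session. Instead of relying on how to_datetime treats the trailing `.0` that a float-to-str conversion produces, this variant converts the non-missing values to integers first and restores the missing rows as NaT afterwards:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: dates stored as floats, with NaN for missing values.
df = pd.DataFrame({
    "col1": [20040929.0, np.nan],
    "col2": [np.nan, 20040925.0],
})

def floats_to_dates(col):
    """Parse one float column of YYYYMMDD values into datetimes."""
    # Work only on the non-missing values so the integer cast is safe,
    # and so the strings look like '20040929' rather than '20040929.0'.
    text = col.dropna().astype("int64").astype(str)
    parsed = pd.to_datetime(text, format="%Y%m%d")
    # Reindexing against the original index puts NaT back where NaN was.
    return parsed.reindex(col.index)

result = df.apply(floats_to_dates)
print(result)
```

The parsing itself is still a single to_datetime call per column, so this should keep the vectorised behaviour of the approach above.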
EDIT
Using strptime will be slower and less convenient. Firstly, because the dtype is float, converting to str introduces a trailing .0 that we have to split off. Secondly, strptime doesn't understand Series, so we have to call applymap to apply it element by element. On top of this, NaN will cause strptime to raise an error, so we have to do the following:
In [203]:
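import datetime as dt

def func(x):
    try:
        # drop the trailing '.0' left over from the float conversion
        return dt.datetime.strptime(x.split('.')[0], '%Y%m%d')
    except (ValueError, AttributeError):
        # missing values cannot be parsed, so return NaT instead
        return pd.NaT

df.astype(str).applymap(func)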
Out[203]:
        col1       col2
0 2004-09-29        NaT
1        NaT 2004-09-25
Timings
Comparing the two methods on a 2,000-row df:
In [212]:
%timeit df.astype(str).apply(lambda x: pd.to_datetime(x, format='%Y%m%d'))
100 loops, best of 3: 8.11 ms per loop
In [213]:
%%timeit
def func(x):
    try:
        return dt.datetime.strptime(x.split('.')[0], '%Y%m%d')
    except (ValueError, AttributeError):
        return pd.NaT

df.astype(str).applymap(func)
10 loops, best of 3: 86.3 ms per loop
We observe that the pandas method is over 10X faster. It is likely to scale much better too, as it's vectorised.
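To reproduce the comparison outside IPython, something like the following sketch could be used. The 2,000-row test frame and the function names `vectorised` and `elementwise` are assumptions for illustration. The element-wise version calls str() on each value and uses Series.map per column rather than applymap, so it is a close stand-in for the code above, not an exact copy. Absolute timings will vary with machine and pandas version.

```python
import datetime as dt
import timeit

import numpy as np
import pandas as pd

# Hypothetical 2,000-row frame: every other value is missing.
n = 2000
values = np.where(np.arange(n) % 2 == 0, 20040929.0, np.nan)
df = pd.DataFrame({"col1": values, "col2": values[::-1]})

def vectorised(frame):
    """One to_datetime call per column."""
    def convert(col):
        text = col.dropna().astype("int64").astype(str)
        return pd.to_datetime(text, format="%Y%m%d").reindex(col.index)
    return frame.apply(convert)

def elementwise(frame):
    """One strptime call per cell."""
    def parse(x):
        try:
            # str(20040929.0) is '20040929.0'; keep the part before the dot.
            return dt.datetime.strptime(str(x).split(".")[0], "%Y%m%d")
        except ValueError:
            # str(nan) is 'nan', which strptime rejects.
            return pd.NaT
    return frame.apply(lambda col: col.map(parse))

# Time ten runs of each approach and report the average per run.
runs = 10
t_vec = timeit.timeit(lambda: vectorised(df), number=runs) / runs
t_elem = timeit.timeit(lambda: elementwise(df), number=runs) / runs
print(f"vectorised:  {t_vec * 1000:.2f} ms per run")
print(f"elementwise: {t_elem * 1000:.2f} ms per run")
```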
str() to convert the float to a string before passing it to datetime.strptime() - gtlambert