I have data files containing year, day of the year (DOY), hour and minutes, as follows:
BuoyID Year Hour Min DOY POS_DOY Lat Lon Ts
0 300234065718160 2019 7 0 216.2920 216.2920 58.559 -23.914 14.61
1 300234065718160 2019 9 0 216.3750 216.3750 58.563 -23.905 14.60
2 300234065718160 2019 10 0 216.4170 216.4170 58.564 -23.903 14.60
3 300234065718160 2019 11 0 216.4580 216.4580 58.563 -23.906 14.60
4 300234065718160 2019 12 0 216.5000 216.5000 58.561 -23.910 14.60
In order to make my datetime, I used:
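import datetime
import pandas as pd
# Build the date from the year and the day of the year
dt_raw = pd.to_datetime(df_buoy['Year'] * 1000 + df_buoy['DOY'], format='%Y%j')
# Keep only the date part
dt_buoy = [d.date() for d in dt_raw]
# Combine the first date with its hour and minute
date = datetime.datetime.combine(dt_buoy[0], datetime.time(df_buoy.Hour[0], df_buoy.Min[0]))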
My problem arises when the hours are not int, but float instead. For example:
BuoyID Year Hour Min DOY POS_DOY Lat Lon BP Ts
0 300234061876910 2014 23.33 0 226.972 226.972 71.93081 -141.0792 1016.9 -0.01
1 300234061876910 2014 23.50 0 226.979 226.979 71.93020 -141.0826 1016.8 3.36
2 300234061876910 2014 23.67 0 226.986 226.986 71.92968 -141.0856 1016.8 3.28
3 300234061876910 2014 23.83 0 226.993 226.993 71.92934 -141.0876 1016.8 3.22
4 300234061876910 2014 0.00 0 227.000 227.000 71.92904 -141.0894 1016.8 3.18
What I tried was to convert the hours to str, take the first two characters to get the whole hour, then subtract that from 'Hour' and multiply by 60 to get the minutes.
int_hour = [(int(str(i)[0:2])) for i in df_buoy.Hour]
minutes = map(lambda x, y: (x - y)*60, df_buoy.Hour, int_hour)
But, of course, if you have '0.' as your hour, Python will complain:
ValueError: invalid literal for int() with base 10: '0.'
My question is: does anyone know a simple way to convert year, DOY, hour (either int or float) and minutes to datetime?
float first, then use int. – greentec
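The suggestion in the comment above can be sketched roughly as follows. This is only an illustration, not a tested answer for the real files: `split_hour` is a hypothetical helper, the small `df_buoy` frame is made-up sample data copied from the second table, and rounding to the nearest minute is an assumption about what precision is wanted. Converting with `float` first and then `int` avoids slicing the string, so an hour such as 0.0 no longer raises the ValueError. For whole columns, it may be simpler to skip the splitting and add the fractional hours as a timedelta.

```python
import pandas as pd


def split_hour(hour):
    """Hypothetical helper: split a possibly fractional hour into (whole_hour, minutes)."""
    hour = float(hour)   # float first, so inputs such as "0." or 23.5 both parse
    whole = int(hour)    # then int, which truncates the fractional part
    return whole, (hour - whole) * 60


# Made-up sample frame mirroring a few rows of the second table
df_buoy = pd.DataFrame({
    "Year": [2014, 2014, 2014],
    "Hour": [23.33, 23.50, 0.00],
    "Min": [0, 0, 0],
    "DOY": [226.972, 226.979, 227.000],
})

# January 1st of each year
start_of_year = pd.to_datetime(df_buoy["Year"].astype(str), format="%Y")

# DOY is 1-based, so DOY 1 is an offset of 0 days; the fraction is dropped
day_offset = df_buoy["DOY"].astype(int) - 1

# Add days, fractional hours and minutes as timedeltas
dt = (
    start_of_year
    + pd.to_timedelta(day_offset, unit="D")
    + pd.to_timedelta(df_buoy["Hour"].astype(float), unit="h")
    + pd.to_timedelta(df_buoy["Min"], unit="m")
)

# Assumption: two-decimal hours such as 23.33 are meant as whole minutes
dt = dt.dt.round("min")
print(dt)
```

Because the hours are added as a timedelta, the same code should handle both the integer-hour and the float-hour files without a separate branch.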