I have a data frame like this,
df
col1  col2        col3
A     2021-02-01  P
B     2021-02-12  P
C     2021-02-08  Q
A     2021-02-04  Q
B     2021-02-14  Q
A     2021-02-15  S
col2 is a pandas datetime column. I want to group the col3 values by col1 and by col2 within a date range of +- 4 days. For example, the col3 values of rows dated between 2021-02-01 and 2021-02-04 should be grouped together for the same col1 value.
So the final data frame should look like,
col1  col3
A     [P,Q]
B     [P,Q]
C     [Q]
A     [S]
This could be done with a for loop that checks the datetime differences, but the execution time would be huge, so I am looking for a pandas shortcut to do this more efficiently.
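To make the requirement concrete, here is a minimal sketch of one possible reading of the "+- 4 days" rule: within each col1 value, rows are sorted by date and a new group starts whenever the gap to the previous row is more than 4 days. That reading is an assumption, not something confirmed above. The helper names `gap`, `new_cluster`, `cluster` and `result` are made up for illustration. Note that under this reading dates chain together (01, 04, 07 would all land in one group), and the rows come out ordered by col1 rather than in the order shown in the expected output.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "col1": ["A", "B", "C", "A", "B", "A"],
    "col2": pd.to_datetime([
        "2021-02-01", "2021-02-12", "2021-02-08",
        "2021-02-04", "2021-02-14", "2021-02-15",
    ]),
    "col3": ["P", "P", "Q", "Q", "Q", "S"],
})

# Sort so that consecutive rows of the same col1 are in date order
df_sorted = df.sort_values(["col1", "col2"])

# Gap to the previous row within the same col1 (NaT for the first row of each col1)
gap = df_sorted.groupby("col1")["col2"].diff()

# Assumption: a new cluster starts at the first row of each col1,
# or when the gap to the previous row exceeds 4 days
new_cluster = gap.isna() | (gap > pd.Timedelta(days=4))

# Running count of cluster starts gives each cluster a unique id
cluster = new_cluster.cumsum().rename("cluster")

# Collect col3 values per (col1, cluster) and drop the helper id
result = (
    df_sorted.groupby(["col1", cluster], sort=False)["col3"]
    .agg(list)
    .reset_index(level="cluster", drop=True)
    .reset_index()
)

print(result)
```

This avoids an explicit Python loop over rows; the sort, `diff` and `cumsum` steps are all vectorized. If the window should instead be anchored at the first date of each group, the clustering step would need to change.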
"date range of +- 4 days", is it possible to explain more? – jezrael