1
votes

I have a column of dates that I split into 2 columns of dates, DATE_1 and DATE_2. I have been attempting to find a way to get the week number of year of each date and the day of the week for each date within that range, exclusive of DATE_2 For example:

Date_1      Date_2
2020-09-27  2020-10-01
2020-12-24  2020-12-29
2020-12-24  2021-01-03
2020-12-28  2021-01-03

and I want to get

Date_1      Date_2      Week                       Days
2020-09-27  2020-10-01  39,40,40,40                Sun,Mon,Tues,Wed
2020-12-24  2020-12-29  52,52,52,53                Thurs,Fri,Sat,Sun,Mon
2020-12-24  2021-01-03  52,52,52,53,53,53,53,53,53 Thurs,Fri,Sat,Sun,Mon,Tues,Wed,Thur,Fri,Sat
2020-12-28  2021-01-03  53,53,53,53,53,53          Mon,Tues,Wed,Thur,Fri,Sat

The way the days display can be full names or numeric values that correspond to certain days, the most important thing is that the data exists somewhere.

I know that pandas has date_range but I couldn't figure out how to incorporate that into what I am looking for. Maybe it isn't pandas specific I am not really sure. Any help would be appreciated.

2
That actually looks like it will work, do you know if there is a way to expand the columns to find the isocalendar() data for all the dates not shown?Larry Flores
Looks like this will resolve the issue: stackoverflow.com/questions/7274267/…Abhimanyu Shekhawat

2 Answers

1
votes

With a little help from links in the comments, I've come up with a solution using date_range:

import pandas as pd

x = {'Date_1': {0: '2020-09-27', 1: '2020-12-24', 2: '2020-12-24', 3: '2020-12-28'},
    'Date_2': {0: '2020-10-01', 1: '2020-12-29', 2: '2021-01-03', 3: '2021-01-03'}}

weekdays = {1: "Mon", 2: "Tues", 3: "Wed", 4: "Thur", 5: "Fri", 6: "Sat", 7: "Sun"}

df = pd.DataFrame(x)

# Creates a new column containing all the days between Date_1 and Date_2
df["Week"] = df.apply(lambda row: pd.date_range(start=row["Date_1"], end=row["Date_2"], freq="D"), axis=1)
# Using the days, we collect the weekdays of the days
df["Days"] = df["Week"].apply(lambda dates: [weekdays.get(date.isocalendar()[2]) for date in dates])
# Finally we gather the week-number for all of the days
df["Week"] = df["Week"].apply(lambda dates: [date.isocalendar()[1] for date in dates])

Output:

       Date_1      Date_2                                          Week                                                        Days
0  2020-09-27  2020-10-01                          [39, 40, 40, 40, 40]                                 [Sun, Mon, Tues, Wed, Thur]
1  2020-12-24  2020-12-29                      [52, 52, 52, 52, 53, 53]                            [Thur, Fri, Sat, Sun, Mon, Tues]
2  2020-12-24  2021-01-03  [52, 52, 52, 52, 53, 53, 53, 53, 53, 53, 53]  [Thur, Fri, Sat, Sun, Mon, Tues, Wed, Thur, Fri, Sat, Sun]
3  2020-12-28  2021-01-03                  [53, 53, 53, 53, 53, 53, 53]                       [Mon, Tues, Wed, Thur, Fri, Sat, Sun]
1
votes

You can do it this way:

def f(x):
    didx = pd.date_range(x['Date_1'], x['Date_2'])
    return pd.Series([didx.isocalendar().week.values, 
                      didx.strftime('%a').values], 
                      index=['Weeks', 'Days'])

df[['Weeks', 'Days']] = df.apply(f, axis=1)
  

Output:

      Date_1     Date_2                                         Weeks                                               Days
0 2020-09-27 2020-10-01                          [39, 40, 40, 40, 40]                          [Sun, Mon, Tue, Wed, Thu]
1 2020-12-24 2020-12-29                      [52, 52, 52, 52, 53, 53]                     [Thu, Fri, Sat, Sun, Mon, Tue]
2 2020-12-24 2021-01-03  [52, 52, 52, 52, 53, 53, 53, 53, 53, 53, 53]  [Thu, Fri, Sat, Sun, Mon, Tue, Wed, Thu, Fri, ...
3 2020-12-28 2021-01-03                  [53, 53, 53, 53, 53, 53, 53]                [Mon, Tue, Wed, Thu, Fri, Sat, Sun]