This should be simple, but took me too many rows to complete:
I have a pandas data frame indexed by id and year.
For the groups A and B, I want to backfill only the values of 2022 to 2021 (2019 values should be left untouched). See table below.
This worked in the end, but cumbersome:
df = df.swaplevel()
# create a frame with backfilled values
dfbf = df.loc[[2021,2022]].groupby('id')['value'].bfill(limit=1).to_frame()
df = df.join(dfbf, rsuffix = '_fill')
df = df.reset_index(level = "id")
df['value'].loc[2021] = df.loc[2021][['value', 'value_fill']].sum(axis = 1)
df = df.set_index(['id'], append=True).swaplevel()
Example:
value | value | |||||
---|---|---|---|---|---|---|
id | year | id | year | |||
A | 2019 | A | 2019 | |||
A | 2020 | 1 | A | 2020 | 1 | |
A | 2021 | A | 2021 | 3 | ||
A | 2022 | 3 | A | 2022 | 3 | |
B | 2019 | B | 2019 | |||
B | 2020 | 12 | B | 2020 | 12 | |
B | 2021 | B | 2021 | 11 | ||
B | 2022 | 11 | B | 2022 | 11 |