How mutate in Rstudio columns that have same pattern in column names?

Question

I have a question that may be very simple, but I still haven't managed to solve it.

I have a data.frame with multiple columns, with the following structure in the column names: NDVI_20180506, NDVI_20180526, NDVI_20180917, NDVI_20180929, NDVI_20181008, NDVI_20181126 ... I would like to create a new column every time a similar pattern appears in the column name ("NDVI_201805").

For example: NDVI_May (column name) and that the column contains the average of the columns NDVI_20180506, NDVI_20180526.

Phil Phil · Accepted Answer · 2020-07-19T04:42:11

library(dplyr)
library(tidyr)
library(lubridate)

Generating some data:

dat <- tibble(NDVI_20180506 = rnorm(10),
              NDVI_20180526 = rnorm(10),
              NDVI_20180917 = rnorm(10),
              NDVI_20180929 = rnorm(10),
              NDVI_20181008 = rnorm(10),
              NDVI_20181126 = rnorm(10))

dat %>% 
  pivot_longer(everything()) %>% # Turn to long format to manipulate variable names
  separate(name, c("name", "date"), "_") %>% # Separate date from variable name
  mutate(date = ymd(date), # Set to date
         month = format(date, "%B")) %>% # Extract the month's name
  unite(name, name, month, sep = "_") %>% # Merge variable name with month's name
  group_by(name) %>% 
  summarize(value = mean(value)) %>% # Average by variable
  pivot_wider(names_from = name, values_from = value) # Bring back to wide format

  NDVI_May NDVI_November NDVI_October NDVI_September
     <dbl>         <dbl>        <dbl>          <dbl>
1    0.137         0.258       -0.454         0.0115

How mutate in Rstudio columns that have same pattern in column names?

1 Answers