0
votes

I would like to mutate several variables at once using mutate_at(). Up to now I have recoded each variable by hand, but since I'm dealing with a long list of variables to recode and rename, I want to know how to do this with mutate_at(). I want to keep the original columns, which is why I'm using mutate() rather than rename(). This is what I normally do:

df <- df %>%
  mutate(q_50_a = as.numeric(`question_50_part_a: very long very long very long very long` == "yes"),
         q_50_b = as.numeric(`question_50_part_b: very long very long very long very long` == "yes"),
         q_50_c = as.numeric(`question_50_part_c: very long very long very long very long` == "yes"))

This is what I have so far:

df <- df %>% mutate_at(vars(starts_with("question_50")), funs(q_50 = as.numeric(. == "yes")))

It works and creates new numeric variables, but I'm not sure how to get it to name them q_50_a, q_50_b, q_50_c, etc.

Thank you.

edit: this is what the data looks like (except there are many more columns, which all look alike):

question_50_part_a: a very long title   question_50_part_b: a very long title
yes                                               yes
yes                                               no
yes                                               no
yes                                               yes
no                                                no
yes                                               yes

I would like this:

q_50_a   q_50_b
1         1
1         0
1         0
1         1
0         0
1         1

I want to keep the original columns as they are and simply add these new columns with the shorter names and numeric binary coding.

2 Answers

0
votes

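Passing a named function to mutate_at adds the new columns with that name as a suffix (here _new). We can then use rename_at to shorten the names of those new columns.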

library(dplyr)

df %>%
  mutate_at(vars(starts_with('question_50')),
            list(new = ~as.numeric(. == 'yes'))) %>%
  rename_at(vars(ends_with('new')),
            ~sub('\\w+(_\\d+)_part(\\w+):.*', 'q\\1\\2', .))



#     question_50_part_a: a very long title question_50_part_b: a very long title
#1                                   yes                                   yes
#2                                   yes                                    no
#3                                   yes                                    no
#4                                   yes                                   yes
#5                                    no                                    no
#6                                   yes                                   yes

#  q_50_a q_50_b
#1      1      1
#2      1      0
#3      1      0
#4      1      1
#5      0      0
#6      1      1
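
As a side note, in dplyr 1.0.0 and later mutate_at() and rename_at() are superseded by across() and rename_with(). Below is a minimal sketch of the same idea with those functions. It assumes a toy df with two columns shaped like the ones in the question; the data and the name out are made up for illustration.

```r
library(dplyr)

# Toy data mirroring the question's layout (illustrative only).
# check.names = FALSE keeps the long, non-syntactic column names as they are.
df <- data.frame(
  `question_50_part_a: a very long title` = c("yes", "yes", "yes", "yes", "no", "yes"),
  `question_50_part_b: a very long title` = c("yes", "no", "no", "yes", "no", "yes"),
  check.names = FALSE
)

out <- df %>%
  # add a numeric 0/1 copy of every matching column, suffixed with "_new"
  mutate(across(starts_with("question_50"),
                ~ as.numeric(.x == "yes"),
                .names = "{.col}_new")) %>%
  # shorten only the new columns; the originals are left untouched
  rename_with(~ sub("\\w+(_\\d+)_part(\\w+):.*", "q\\1\\2", .x),
              ends_with("_new"))

names(out)
```

The regular expression is the same one as above, so it relies on the same assumption about how the original column names are formatted.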
0
votes

Here is an approach that loops over each column:

column_names = colnames(df)
# optional: filter out the column names you don't want to change here

for(col in column_names){
   # construct replacement name
   col_replace = paste0("q_", substr(col, 10, 11), "_", substr(col, 18, 18))
   # assign and drop old column
   df = df %>%
      mutate(!!sym(col_replace) := ifelse(!!sym(col) == "yes", 1, 0)) %>%
      select(-!!sym(col))
}

Points to note:

  • If you have other columns you don't want changed, be sure to exclude them from column_names.
  • The !!sym(col) construction takes the text string stored in col and turns it into a column name.
  • We use := rather than = because the LHS requires some evaluation before assignment can happen.
  • I have used ifelse instead of as.numeric, but you can code the RHS of the assignment as you please.
  • Creating col_replace makes some assumptions about the format of your input names. If every name has the same length, this should work. If the number of characters differs (e.g. Q_9_a and Q_10_a), then you may want to use a method based on strsplit instead.
  • The - sign in select makes it exclude the specified column. Leave out the select step if you want to keep the original columns.
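
To illustrate the strsplit suggestion, here is a rough sketch of a helper that builds the short name from the pieces of the original name rather than from fixed character positions. The function name make_short_name is made up for this example, and it assumes every column name follows the pattern question_<number>_part_<letter>: <title>.

```r
# Hypothetical helper: build "q_<number>_<letter>" from a long column name.
make_short_name <- function(col) {
  # keep only the part before the first colon, e.g. "question_50_part_a"
  stem <- strsplit(col, ":", fixed = TRUE)[[1]][1]
  # split on underscores, e.g. c("question", "50", "part", "a")
  parts <- strsplit(stem, "_", fixed = TRUE)[[1]]
  # the 2nd piece is the question number, the 4th is the part letter
  paste("q", parts[2], parts[4], sep = "_")
}

make_short_name("question_50_part_a: a very long title")
make_short_name("question_9_part_b: a very long title")
```

Inside the loop, col_replace = make_short_name(col) could then take the place of the substr-based line, and it should keep working when the question number has a different number of digits.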