I am trying to recode a character variable with dplyr::recode() and stringr::str_detect(). I realize that this can be done with dplyr::case_when(), as documented here: https://community.rstudio.com/t/recoding-using-str-detect/5141, but I am convinced that there has to be a way of doing it via recode().

Consider this case:

library(tidyverse)
rm(list = ls())

data <- tribble(
  ~id, ~time,
  #--|--|
  1, "a",
  2, "b",
  3, "x"
)

I would like to replace the "x" in the dataframe with a "c" via str_detect() and here's how I'd do it:

data %>% 
 mutate(time = recode(data$time, str_detect(data$time, "x") = "c"))

But that doesn't work:

Error: unexpected '=' in: "data %>% mutate(time = recode(data$time, str_detect(data$time, "x") ="

Apparently R doesn't know what to do with the last =, but I believe it has to be there for the recode function, as demonstrated here:

recode(data$time, "x" = "c")

This executes properly, as does this:

str_detect(data$time, "x")

But this does not:

recode(data$time, str_detect(data$time, "x") = "c")

Is there a way of getting these two functions to work with each other?

str_detect returns TRUE or FALSE, not the character you are looking for. Either use gsub or, if you want to use str_detect, combine it with case_when or ifelse. - phiver
So that is the problem. recode() does not understand what to do with TRUE instead of the actual character, I see. - tc_data
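
For reference, the suggestion in the comment above can be sketched as follows. This is a minimal example, not the only way to do it: it assumes the same `data` tibble as in the question and uses `dplyr::if_else()` with the logical vector from `str_detect()` as the condition. Note also that the original error is a parse error: in a function call, the left-hand side of `=` must be an argument name, not an expression such as `str_detect(...)`, so R rejects the call before `recode()` ever runs.

```r
library(dplyr)
library(stringr)
library(tibble)

# Same example data as in the question
data <- tribble(
  ~id, ~time,
  1, "a",
  2, "b",
  3, "x"
)

# str_detect() returns a logical vector, so it works as a condition:
# where the value contains "x", use "c"; otherwise keep the original value
result <- data %>%
  mutate(time = if_else(str_detect(time, "x"), "c", time))

result
```

Inside `mutate()` the column can be referred to as `time` directly; `data$time` is not needed.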

1 Answer

If you want the simplest possible solution, I'd use gsub:

library(dplyr)
data %>% 
  mutate(time = gsub("x", "c", time))

That removes the need for both recode and str_detect.

If you're dead set on using stringr, then you should use str_replace rather than str_detect:

data %>% 
  mutate(time = str_replace(time, "x", "c"))

If you want to replace the entire value if it contains an 'x', then just add some regex:

data %>% 
  mutate(time = str_replace(time, ".*x.*", "c"))

Breakdown of the regex: . matches any single character (except \n), and * repeats it zero or more times. We put .* both before and after the x so that any characters preceding or following the 'x' are part of the match and are replaced along with it.
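
To see how the three variants differ, here is a small sketch on a made-up vector (the values "box" and "xx" are illustrative assumptions, not from the question's data). One thing worth knowing: `gsub()` replaces every match, while `str_replace()` replaces only the first one; `str_replace_all()` would be the stringr counterpart of `gsub()`.

```r
library(stringr)

# Hypothetical values chosen to expose the differences
time <- c("a", "box", "xx")

# Replaces only the first "x" in each element
first_only <- str_replace(time, "x", "c")

# Replaces every "x" in each element
all_matches <- gsub("x", "c", time)

# Replaces the whole element if it contains an "x" anywhere
whole_value <- str_replace(time, ".*x.*", "c")

first_only   # "a" "boc" "cx"
all_matches  # "a" "boc" "cc"
whole_value  # "a" "c"   "c"
```

With single-character values like those in the question, all three give the same result, so the choice only matters once the column holds longer strings.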