2
votes

I created a user-defined function that searches text for certain values and returns a different value. It works fine for each individual call; however, when I use it in the tidyverse with mutate, it no longer works. I get a warning:

Warning message:

In if (grepl("Unique", textValue)) { : the condition has length > 1 and only the first element will be used

I'm guessing it has something to do with types and formats, but I'm not sure how to solve it.

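library(dplyr)  # provides %>%, mutate and as_tibble

# create fake data
P1 = c("Unique Claims", "Unique Records", "Spend Today", "Spend Yesterday", "% Returned", "% Claimed")
P2 = as_tibble(P1)  # as.tibble() is deprecated; the column is named "value"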


#create function
assignFormat <- function(textValue = character()) {
  if (grepl("Unique", textValue)) {
    numFormat = "Comma"
  } else if (grepl("Spend", textValue)) {
    numFormat = "Currency"
  } else if (grepl("%", textValue)) {
    numFormat = "Percent"
  } else {
    numFormat = "Other"
  }

  return(numFormat)
}


#test function - works fine
assignFormat("% of Claims")
assignFormat("Unique Records")
assignFormat("Total Spend")

#doesn't work
P3 = P2 %>%
     mutate(y = assignFormat(value))

Things I've tried: switching to grep, and using grep directly in mutate, which creates three vectors instead.

Options and help are appreciated!

3 Answers

2
votes

To keep the same function, you could use one of purrr's map variants:

library(dplyr)
library(purrr)

P2 %>%  mutate(y = map_chr(value, assignFormat))

# A tibble: 6 x 2
#  value            y       
#  <chr>           <chr>   
#1 Unique Claims   Comma   
#2 Unique Records  Comma   
#3 Spend Today     Currency
#4 Spend Yesterday Currency
#5 % Returned      Percent 
#6 % Claimed       Percent 

You could also change the function to use ifelse instead of if:

assignFormat <- function (textValue = as.character()) {
   ifelse(grepl("Unique", textValue), "Comma", 
          ifelse(grepl("Spend", textValue), "Currency", 
              ifelse(grepl("%", textValue),"Percent", "Other")))
}

P2 %>% mutate(y = assignFormat(value))

Another option is to use case_when, which is designed for such operations:

P2 %>%
  mutate(y = case_when(grepl("Unique", value) ~ "Comma", 
                       grepl("Spend", value) ~ "Currency", 
                       grepl("%", value) ~ "Percent", 
                       TRUE ~ "Other"))
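
Like the original if/else chain, case_when checks its conditions in order and uses the first one that matches. A minimal sketch of that behaviour, assuming a made-up input vector x that is not part of the question's data (fixed = TRUE is added here only to make explicit that "%" is matched literally):

```r
library(dplyr)

# Hypothetical inputs for illustration; "Unique Spend" matches two patterns
x <- c("Unique Spend", "Spend Today", "% Returned", "Total")

result <- case_when(grepl("Unique", x) ~ "Comma",
                    grepl("Spend", x) ~ "Currency",
                    grepl("%", x, fixed = TRUE) ~ "Percent",
                    TRUE ~ "Other")

# The first matching condition wins, so "Unique Spend" becomes "Comma"
result
```

So if a label could match more than one pattern, put the most specific condition first.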
2
votes

Functions written for a single value, like this one, work as intended in dplyr if you group the data rowwise, so that mutate calls the function once per row:

#does work
P3 = P2 %>%
  rowwise() %>% 
  mutate(y = assignFormat(value)) %>% 
  ungroup()
1
votes

Use sapply:

> sapply(P2$value, assignFormat)
  Unique Claims  Unique Records     Spend Today Spend Yesterday      % Returned       % Claimed 
        "Comma"         "Comma"      "Currency"      "Currency"       "Percent"       "Percent" 

To add to the data frame:

P2 %>% 
  mutate(y = sapply(value, assignFormat))
# A tibble: 6 x 2
  value           y      
  <chr>           <chr>   
1 Unique Claims   Comma   
2 Unique Records  Comma   
3 Spend Today     Currency
4 Spend Yesterday Currency
5 % Returned      Percent 
6 % Claimed       Percent 

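The warning message is actually informative: if expects a single TRUE or FALSE, but inside mutate the function receives the whole column, so grepl returns six values and only the first one is used. The function is designed to work on a single element, so we "vectorize" it by using the apply family of functions. Since we expect a single result per input, we use sapply to return a vector of output.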
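
To see the length mismatch directly, here is a small base R sketch using the data from the question. It also swaps in vapply, a stricter relative of sapply that is not used above: it checks that every call returns exactly one string, and USE.NAMES = FALSE drops the names that sapply attached in the first output.

```r
P1 <- c("Unique Claims", "Unique Records", "Spend Today",
        "Spend Yesterday", "% Returned", "% Claimed")

# grepl() is vectorized: one logical per element, so six values here.
# if () can only use a single TRUE/FALSE, which is what the warning says.
is_unique <- grepl("Unique", P1)
length(is_unique)

# Scalar function from the question (works on one string at a time)
assignFormat <- function(textValue) {
  if (grepl("Unique", textValue)) {
    "Comma"
  } else if (grepl("Spend", textValue)) {
    "Currency"
  } else if (grepl("%", textValue)) {
    "Percent"
  } else {
    "Other"
  }
}

# Call the function once per element and require a length-1 character result
formats <- vapply(P1, assignFormat, character(1), USE.NAMES = FALSE)
formats
```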