4
votes

I have a column in a data frame with some missing values, which show up as empty strings ("").

unique(page_my_df$Type)
[1] "list"              "narrative" "how to"            "news feature"     
[5] "diary"     ""                  "interview" 

I want to replace all instances of "" with "unknown".

page_my_df <- page_my_df %>% 
  mutate(Type = str_replace(.$Type, "", "unknown"),
         Voice = str_replace(.$Voice, "", "unknown"))

Error in mutate_impl(.data, dots) : Evaluation error: Not implemented.

I read the stringr documentation, specifically the description of the pattern argument:

Match character, word, line and sentence boundaries with boundary(). An empty pattern, "", is equivalent to boundary("character").

So I tried:

page_my_df <- page_my_df %>% 
  mutate(Type = str_replace(.$Type, boundary(""), "unknown"),
         Voice = str_replace(.$Voice, boundary(""), "unknown"))

Which then gave:

Error in mutate_impl(.data, dots) : Evaluation error: 'arg' should be one of “character”, “line_break”, “sentence”, “word”.

How can I replace empty character strings with "unknown" within dplyr::mutate()?

2
You could use nchar() for the test instead: it is 0 for "" and nonzero for every other string. - Linus
Why can't you use mutate(Type = ifelse(Type == "", NA, Type)) or even na_if(Type, "")? - dmi3kno
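
The na_if() suggestion in the comments yields NA rather than "unknown", so it needs one more step. A minimal sketch, assuming dplyr is installed; the small data frame is a hypothetical stand-in for page_my_df, with Type values borrowed from the question:

```r
library(dplyr)

# Hypothetical stand-in for the asker's data.
df <- data.frame(
  Type = c("list", "narrative", "", "interview"),
  stringsAsFactors = FALSE
)

# na_if() turns "" into NA; coalesce() then fills every NA with "unknown".
# Caveat: values that were already NA are replaced as well.
result <- df %>%
  mutate(Type = coalesce(na_if(Type, ""), "unknown"))

result$Type
```

This is only appropriate if genuine NA values should also be labelled "unknown"; otherwise a direct comparison against "" keeps the two cases separate.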

2 Answers

5
votes

Here is one approach:

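library(tidyverse)  # attaches dplyr and stringr, so no separate library(stringr) is needed

z <- c("list", "narrative", "how to", "news feature",
       "diary", "", "interview")

data.frame(element = seq_along(z), Type = z) %>%
  # "^$" matches only strings that are empty from start to end
  mutate(Type = str_replace(Type, "^$", "unknown"))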
#output
  element         Type
1       1         list
2       2    narrative
3       3       how to
4       4 news feature
5       5        diary
6       6      unknown
7       7    interview

Also, there is no need to refer to the data frame with .$ inside the mutate() call; columns can be referenced by name directly.

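The caret ^ and the dollar sign $ are metacharacters that match the start and the end of the string, respectively. The pattern "^$" therefore matches only a string with nothing between its start and its end, i.e. an empty string.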
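
To see what the anchors do and do not match, here is a small sketch. The vector x is made up for illustration, and the whitespace-tolerant pattern "^\\s*$" is an assumption about what one might want, not something the question asks for:

```r
library(stringr)

# Made-up test values: a normal string, an empty string,
# a single space, and a string containing a space.
x <- c("list", "", " ", "how to")

# "^$" requires the end of the string to follow the start immediately,
# so only the truly empty string matches.
empty_only <- str_detect(x, "^$")

# Assumption: if whitespace-only values should also count as missing,
# "^\\s*$" allows any amount of whitespace between the anchors.
blank <- str_detect(x, "^\\s*$")

replaced <- str_replace(x, "^\\s*$", "unknown")
```

With "^$" the single-space entry is left alone; the second pattern is worth considering only if the data may contain blank-looking values that are not strictly empty.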

2
votes

Another solution is to check the length of the string:

library(dplyr)

strings <- c("list","narrative","how to","news feature","diary","","interview" )
df <- data.frame(ID = seq_along(strings), strings, stringsAsFactors = FALSE)

> df
  ID      strings
1  1         list
2  2    narrative
3  3       how to
4  4 news feature
5  5        diary
6  6             
7  7    interview

df <- df %>% mutate(strings = if_else(nchar(strings) == 0, "unknown", strings))

> df
  ID      strings
1  1         list
2  2    narrative
3  3       how to
4  4 news feature
5  5        diary
6  6      unknown
7  7    interview
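
The question asks for the same replacement in both Type and Voice. One way to avoid repeating the expression is across(), which assumes dplyr 1.0.0 or later (newer than the version that produced the mutate_impl error in the question). The data below is hypothetical: the Type values come from the question, the Voice values are invented for illustration.

```r
library(dplyr)

# Hypothetical data; the real page_my_df is not shown in the question.
page_my_df <- data.frame(
  Type  = c("list", "", "diary"),
  Voice = c("", "first person", ""),
  stringsAsFactors = FALSE
)

# Apply the same empty-string check to each listed column.
# Existing NA values stay NA, because NA == "" evaluates to NA.
page_my_df <- page_my_df %>%
  mutate(across(c(Type, Voice), ~ if_else(.x == "", "unknown", .x)))

page_my_df
```

The columns must be character vectors for this to work; factor columns would need converting with as.character() first.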