2
votes

I am trying to use fct_recode() from the forcats package to relabel the values in a column of one data frame so that I can merge it with another data frame. The column is a list of country names from a UN dataset. I coerced it to a factor and then recoded it, but for one of the country names I received this error:

Unknown levels in f: Korea, Dem. People�s Rep.

R does not seem to recognize the apostrophe in the country name. I used anti_join() and unique() to find which entries did not match, and even copying and pasting the name "Korea, Dem. People’s Rep." into fct_recode() gave the same error. The problem seems to relate to the formatting of the apostrophe, since the two calls below differ only in that character:

undata <- mutate(undata, country_name=as.factor(country_name))
undata <- mutate(undata, country_name=fct_recode(country_name, 
                                 "Korea_North"="Korea, Dem. People's Rep."))
# versus
undata <- mutate(undata, country_name=fct_recode(country_name, 
                                  "Korea_North"="Korea, Dem. People’s Rep."))

Copying and pasting either of these seemingly differently formatted apostrophes yields the same error, though.

I'm not sure how to recode it with the "correct" apostrophe.

I'm using version 3.4.3 of R for Windows 10 and tidyverse 1.2.1.

1
Before any of the code above, run gsub("’", "'", country_name) - G5W
Even running gsub("’", "'", undata$country_name) first yields the same error. - Willdebras

1 Answer

0
votes

I've run into the same problem with commas in fct_recode() and ran this code beforehand, modified a bit for your issue:

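library(dplyr)
library(stringr)

# Strip straight (') and curly (\u2019) apostrophes, then convert back to a factor
undata <- undata %>%
  mutate(country_name = as.character(country_name)) %>%
  mutate(country_name = str_replace_all(country_name,
    pattern = "['\u2019]",
    replacement = "")) %>%
  mutate(country_name = as.factor(country_name))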
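
If you would rather keep the apostrophe than strip it, it may help to check which character is actually stored and normalize it before recoding. The snippet below is only a sketch under assumptions: the country_name vector is a made-up stand-in for undata$country_name, and it assumes the stored character is the curly apostrophe U+2019 in a valid UTF-8 string. If the file was read in with a different encoding, the stored bytes may differ and the pattern would need adjusting. Writing the character as the escape "\u2019" avoids depending on how the script file or console encodes a pasted curly quote.

```r
library(forcats)

# Hypothetical stand-in for undata$country_name (assumed to contain U+2019)
country_name <- c("Korea, Dem. People\u2019s Rep.", "France")

# Inspect the code points; anything above 127 is a non-ASCII character
codes <- utf8ToInt(country_name[1])
non_ascii <- codes[codes > 127]   # 8217 is U+2019, the curly apostrophe

# Replace the curly apostrophe with a straight one and keep the result
clean <- gsub("\u2019", "'", country_name, fixed = TRUE)

# The level now contains a plain ASCII apostrophe, so it can be matched
recoded <- fct_recode(factor(clean),
                      "Korea_North" = "Korea, Dem. People's Rep.")
```

Note that gsub() returns a new vector rather than modifying its input, so the result has to be assigned back (for example to undata$country_name) before the recode can see it.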