
My data has levels that are theoretically possible but not present in the data. I can easily represent this in base R:

factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))

If I tabulate it, test3 shows a count of 0. This is useful: I can write functions that assume the levels cover every possible outcome, even if data containing the missing level is only added later.
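As a minimal sketch of that base R behavior (the names x and counts are only illustrative):

```r
# Declare every theoretically possible level up front.
x <- factor(c("test", "test1", "test2"),
            levels = c("test", "test1", "test2", "test3"))

# table() keeps unused levels, so test3 is reported with a count of 0.
counts <- table(x)
print(counts)
```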

I cannot replicate this in forcats. First, the as_factor function does not accept any additional arguments:

forcats::as_factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))

The above produces an error.

The following works, but with a warning, and I would prefer to accomplish my goal without warnings if possible:

forcats::as_factor(c("test","test1","test2")) %>% forcats::fct_recode(`test` = "test", `tests` = "test1", `tests` = "test2", `tests` = "test3")

Warning message:
Unknown levels in `f`: test3 

Is there any way in forcats to play with levels that theoretically exist but are not necessarily in the data at that moment?


2 Answers

1
votes

To replicate the behavior of factor(), one option is fct_expand:

c("test","test1","test2") %>%
       forcats::fct_expand(c("test","test1","test2","test3"))
#[1] test  test1 test2
#Levels: test test1 test2 test3

As for the ... (additional arguments) in as_factor, it is not actually used:

library(forcats)
methods(as_factor)
#[1] as_factor.character* as_factor.factor*    as_factor.logical*   as_factor.numeric*  

Now, we check the code of as_factor.character

getAnywhere(as_factor.character)
function (x, ...) 
{
    structure(fct_inorder(x), label = attr(x, "label", exact = TRUE))
}

fct_inorder receives only x here; nothing passed through ... reaches it.

So we can call fct_expand directly to add levels to a factor, or to a character vector, which it converts to a factor first.
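A minimal sketch of this, assuming forcats is installed (x and f are illustrative names, and the input is converted with factor() first so the example does not rely on the character conversion):

```r
library(forcats)

x <- c("test", "test1", "test2")

# fct_expand() appends any levels that are not already present;
# existing levels and the data values are left unchanged.
f <- fct_expand(factor(x), "test3")

levels(f)
table(f)  # test3 is listed with a count of 0
```

Because the result is an ordinary factor, downstream code that tabulates or loops over levels(f) will see test3 even though no observation has it yet.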

1
votes

Per the advice of @Akrun, I have accomplished what I was seeking to do. See an example below:

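library(magrittr)  # provides the %>% pipe

test <- c("fruit","fruit","apple","drink","meat")
# Named vector: names are the new labels, values are the raw levels
levels <- c(
  `Fruit` = "fruit",
  `Fruit` = "apple",
  `Drink` = "drink",
  `Vegetable` = "vegetable",
  `Meat` = "meat"
)

# Only the levels observed in the data
factor(test) %>% table()
# After adding the missing raw level ("vegetable")
factor(test) %>% forcats::fct_expand(levels) %>% table()
# After recoding raw levels to labels; Vegetable appears with a count of 0
factor(test) %>% forcats::fct_expand(levels) %>% forcats::fct_recode(!!!levels) %>% table()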