3
votes

I am trying to convert a numeric vector with discrete values into a factor in R.

x <- c(1,2,3,4,8,9,10,88,89,90)

I need this vector to be converted into a factor variable with 4 levels as follows:

1,2 (level 1)

3,4 (level 2)

8,9,10 (level 3)

88,89,90 (level 4)

I have tried using factor in R as follows:

y <- factor(x, levels = c(1:2, 3:4, 8:10, 88:90))

This returns a factor with 10 levels instead of the 4 levels that I want.

str(y)
Factor w/ 10 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 9 10
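
The 10 levels appear because c(1:2, 3:4, 8:10, 88:90) flattens the four ranges into a single vector of 10 values, and each value becomes its own level. One possible way to group them, sketched here in base R, is to assign a named list to levels(), which merges the listed old levels under each name. The level names "1" to "4" are just an illustrative choice, not something the question prescribes.

```r
x <- c(1, 2, 3, 4, 8, 9, 10, 88, 89, 90)

# Start with one level per distinct value (10 levels)
y <- factor(x)

# Merge old levels: each list name is a new level,
# each element lists the old levels it absorbs
levels(y) <- list(
  "1" = as.character(1:2),
  "2" = as.character(3:4),
  "3" = as.character(8:10),
  "4" = as.character(88:90)
)

nlevels(y)  # number of levels after merging
table(y)    # counts per merged level
```

Any old level that is not listed in the named list becomes NA, so every value in x needs to be covered by one of the groups.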

I have also tried using cut as follows:

bins <- c(1,3,5,8,11,88,90)
y <- cut(x, breaks = bins, right = FALSE, include.lowest = TRUE)
table(y)

This also does not return the desired result, because it creates levels for ranges such as [5,8) and [11,88) that I don't need.

y
  [1,3)   [3,5)   [5,8)  [8,11) [11,88) [88,90] 
      2       2       0       3       0       3 

Is there a way to convert a range of numeric values into a factor in R?

2
Drop unused levels: table(droplevels(y)) - zx8754
factor(findInterval(x, c(3,8,88)))? - 27 ϕ 9
Or maybe level <- cut(x, breaks = c(-Inf, 2, 4, 10, Inf), labels = paste("level", 1:4), right = TRUE); aggregate(x~level, FUN = toString) ? (you might not need the aggregate step but not sure) - markus
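
To make the findInterval and cut suggestions from the comments easier to compare, here is a small sketch that runs both on the vector from the question. The variable names y_fi and y_cut and the + 1 shift are additions for illustration (findInterval starts counting at 0 for values below the first break); the breaks themselves are taken from the comments.

```r
x <- c(1, 2, 3, 4, 8, 9, 10, 88, 89, 90)

# findInterval() returns 0 for values below the first break,
# so add 1 to number the groups 1..4
y_fi <- factor(findInterval(x, c(3, 8, 88)) + 1)

# cut() with open-ended outer breaks, so no empty
# in-between intervals are created
y_cut <- cut(x,
             breaks = c(-Inf, 2, 4, 10, Inf),
             labels = paste("level", 1:4),
             right = TRUE)

table(y_fi)
table(y_cut)
```

Both variants assign the same group to every element of x; they differ only in how the levels are labelled. Note that with these breaks, values in the gaps (for example 5 or 50) would silently fall into one of the four groups.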

2 Answers

1
votes

Drop unused levels:

# as per your code
bins <-  c(1,3,5,8,11,88,90)
y <- cut(x, breaks = bins, right = FALSE, include.lowest = TRUE)
levels(y)
# [1] "[1,3)"   "[3,5)"   "[5,8)"   "[8,11)"  "[11,88)" "[88,90]"

# drop unused levels
y1 <- droplevels(y)
levels(y1)
# [1] "[1,3)"   "[3,5)"   "[8,11)"  "[88,90]"
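
If the interval labels are not wanted, the remaining levels can be renamed after dropping the unused ones. This is only a sketch: the "level 1" to "level 4" labels are an assumption about the desired naming, and the approach relies on the data leaving exactly the unwanted intervals empty.

```r
x <- c(1, 2, 3, 4, 8, 9, 10, 88, 89, 90)
bins <- c(1, 3, 5, 8, 11, 88, 90)

y <- cut(x, breaks = bins, right = FALSE, include.lowest = TRUE)

# Remove the intervals that contain no observations
y1 <- droplevels(y)

# Rename the surviving levels in order
levels(y1) <- paste("level", seq_len(nlevels(y1)))

table(y1)
```

Because droplevels() depends on which intervals happen to be empty, a value such as 6 in x would keep the [5,8) level and shift the numbering.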
1
votes

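We can use case_when from dplyr to assign each group of values a level number. case_when returns a numeric vector here, so wrap the result in factor() to get a factor:

library(dplyr)
y <- factor(case_when(x %in% 1:2 ~ 1, x %in% 3:4 ~ 2, x %in% 8:10 ~ 3, x %in% 88:90 ~ 4))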