On the R-devel list, Martin Maechler posted a message about duplicated levels in factors:
"factors with non-unique (duplicated) levels have been deprecated since 2009 -- are more deprecated now ..." (June 4, 2016)
It states that in R 3.4, scheduled for April 2017, duplicated levels will cause an error, no longer just a warning.
I wonder why the levels function does not cause a similar warning. Here I combine the first three levels into "a" in two ways, one of which is deprecated.
Example
> x <- c("a", "b", "c", "d")
> xf <- factor(x, levels = c("a", "b", "c", "d"),
+              labels = c("a", "a", "a", "d"))
Warning message:
In `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  :
  duplicated levels in factors are deprecated
> xf <- factor(x)
> levels(xf) <- c("a", "a", "a", "d")
> xf
[1] a a a d
Levels: a d
I would like to understand why the latter is treated differently by R than the former.
This is the documented behavior of levels; I'm not exploiting an undocumented feature. In ?levels, there is an example in which duplicated levels are allowed. I'll paste it in to save you the lookup.
## combine some levels
z <- gl(3, 2, 12, labels = c("apple", "salad", "orange"))
z
levels(z) <- c("fruit", "veg", "fruit")
z
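For comparison, ?levels also documents a list form of the assignment, in which each list name is a new level and each element gives the old levels to merge into it. Here is a minimal sketch of that form applied to the same data as my example above. The name xf2 is only an illustration, and I am assuming the list form is an acceptable substitute for the duplicated-labels call.

```r
# Same data as in the example above
x <- c("a", "b", "c", "d")
xf2 <- factor(x)  # xf2 is a hypothetical name used only for this sketch

# List form: each name is a new level, each value lists the old levels to merge
levels(xf2) <- list(a = c("a", "b", "c"), d = "d")

levels(xf2)      # the merged level set
as.integer(xf2)  # the underlying integer codes
```

This names each merged level exactly once, so the vector handed to levels never contains duplicates.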