Using the dplyr package in R, I'm trying to reduce a categorical variable from 3 levels to 2. I'm using the famous iris data set and trying to turn the class variable (containing "Iris-versicolor", "Iris-setosa", and "Iris-virginica") into one with only two levels ("Iris-versicolor" and "Iris-setosa"). To create the new data set, I've come up with this:
IRIS_TEST2 <- IRIS_TEST %>%
  filter(class != "Iris-virginica")
But when I try to compute a confidence interval on it:
inference(y = sepal_length, x = class, data = IRIS_TEST2, statistic = "mean",
          type = "ci", method = "theoretical", conf_level = .95)
I continue to get an error:
Error: Categorical variable has more than 2 levels, confidence interval is undefined,
use ANOVA to test for a difference between means
Alternatively, is there a way to restrict the "x =" argument so that it includes only "Iris-versicolor" and "Iris-setosa"?
inference(y = sepal_length, x = class, data = IRIS_TEST2, statistic = "mean",
          type = "ci", method = "theoretical", conf_level = .95)
Any help would be greatly appreciated!
droplevels? Just in case, see this question. – jazzurro
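
A minimal sketch of what the droplevels suggestion would look like, assuming class is stored as a factor. The IRIS_TEST data frame here is only a stand-in built from R's built-in iris data and renamed to match the column names in the question; the real data may differ. filter() removes the rows, but the factor keeps "Iris-virginica" as an unused level, which is probably what the error is complaining about.

```r
library(dplyr)

# Stand-in for IRIS_TEST (assumption: built from R's built-in iris data,
# with columns renamed to match the names used in the question).
IRIS_TEST <- iris %>%
  transmute(sepal_length = Sepal.Length,
            class = factor(paste0("Iris-", Species)))

# filter() drops the rows, but the factor still carries all three levels.
IRIS_TEST2 <- IRIS_TEST %>%
  filter(class != "Iris-virginica")
levels_after_filter <- nlevels(IRIS_TEST2$class)   # still 3

# droplevels() discards factor levels that no longer occur in the data.
IRIS_TEST2 <- IRIS_TEST2 %>%
  droplevels()
levels_after_drop <- levels(IRIS_TEST2$class)      # "Iris-setosa" "Iris-versicolor"
```

With the unused level gone, the same inference() call should see a two-level variable. If class is a character column rather than a factor, there are no levels to drop and the cause would lie elsewhere.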