I am working with a race variable that takes the following values: 1 Black, 2 Hispanic, 3 Mixed Race (Non-Hispanic), 4 Non-Black / Non-Hispanic. I want to combine categories 3 and 4 into the base category and keep Black and Hispanic. I tried to create 2 dummies (Black = 1 and Hispanic = 1), and 2 extra columns are created, but the values in them are not 1 and 0, but FALSE and TRUE. The code I used:
nlsy2$Hispanic <- nlsy2$Race==2
nlsy2$Black <- nlsy2$Race==1
nlsy2$Race [ nlsy2$Race == 0 ] <- 3
nlsy2$Race [ nlsy2$Race == 0 ] <- 4
Also, when I run summary(nlsy2$Hispanic), R gives me this output:
Mode FALSE TRUE NA's
logical 5594 1526 0
Are the NA's problematic when running a glm? Also, if you have a better way to recode the race variable, it would be much appreciated! Thank you!
nlsy2$Hispanic <- (nlsy2$Race == 2) + 0
– Adam Quek
levels function in R, refer to [link] stackoverflow.com/questions/9604001/… , and why do you need to convert to dummies for modelling rather than using as.factor? For NA you can always include na.action = na.exclude in your code, and depending on the data you can always consider imputing it using the mice package – Learner_seeker
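Pulling the two comments together, here is a minimal sketch of both routes: explicit 0/1 dummies, and a single factor with categories 3 and 4 collapsed into the reference level. The data frame below is a made-up stand-in for nlsy2, and the outcome y, the column RaceGrp and the fit_* names are hypothetical; only the 1 to 4 coding of Race comes from the question. Note that the last two lines of the question's code assign to rows where Race == 0, which presumably match nothing given that coding, so they do not merge 3 and 4.

```r
# Hypothetical stand-in for nlsy2; only the Race coding (1-4) is from the question
set.seed(1)
nlsy2 <- data.frame(Race = sample(1:4, 200, replace = TRUE))
nlsy2$y <- rbinom(200, 1, 0.4)  # hypothetical binary outcome

# Route 1: 0/1 dummies. as.integer() turns TRUE/FALSE into 1/0
# (same effect as the "+ 0" trick from the comment)
nlsy2$Black    <- as.integer(nlsy2$Race == 1)
nlsy2$Hispanic <- as.integer(nlsy2$Race == 2)

# Route 2: one factor, with 3 and 4 collapsed into "Other" as the base level
nlsy2$RaceGrp <- ifelse(nlsy2$Race == 1, "Black",
                 ifelse(nlsy2$Race == 2, "Hispanic", "Other"))
nlsy2$RaceGrp <- relevel(factor(nlsy2$RaceGrp), ref = "Other")

# Both specifications describe the same model; glm() builds the
# dummies itself when given a factor
fit_dummies <- glm(y ~ Black + Hispanic, data = nlsy2, family = binomial)
fit_factor  <- glm(y ~ RaceGrp, data = nlsy2, family = binomial)

coef(fit_dummies)
coef(fit_factor)
```

On the NA question: the summary output above reports 0 NA's for Hispanic, so there is nothing missing in that column. If Race did contain NA, both routes would carry the NA through, and glm() with the default na.action (na.omit) would drop those rows before fitting.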