I want to find the least-squares means for a dataset with two categorical variables: gender and age (above or below 55 years). The values in the dataset are the number of hours spent watching TV.
I want the least-squares means for both Age55yr and Gender. The problem is that lsmeans also averages the categorical variables themselves (they are coded as 1 or 2). So instead of getting one row each for 1 (male) and 2 (female), I get a single averaged row (with the value 1.51).
The output of lsmeans(tv_age_lm, ~ Gender) is:
$`Gender lsmeans`
Gender lsmean SE df lower.CL upper.CL
1.514563 29.59223 0.4416212 100 28.71607 30.4684
What I expected was something like:
$`Gender lsmeans`
Gender lsmean SE df lower.CL upper.CL
1 29.59223 0.4416212 100 28.71607 30.4684
2 29.59223 0.4416212 100 28.71607 30.4684
That is, I expected each level of my categorical variables to get its own row instead of being averaged into one. How do I achieve this?
This is the code needed to reproduce the error:
install.packages("lsmeans", repos="http://cran.rstudio.com/")
library(lsmeans)
tvfile <- read.csv2("TVwatch.csv", header=TRUE)
tv_age_lm = lm(TVhrs ~ Age55yr + Gender, data=tvfile)
lsmeans(tv_age_lm, ~ Age55yr)
lsmeans(tv_age_lm, ~ Gender)
The datafile is here: http://textuploader.com/1u27
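A possible cause, offered as an assumption rather than a confirmed diagnosis: because Age55yr and Gender are stored as the numbers 1 and 2, lm() treats them as numeric covariates, and lsmeans evaluates numeric covariates at their mean, which would explain the 1.51. The sketch below recodes both columns with factor() before fitting. The data frame is a synthetic stand-in (the real TVwatch.csv is not reproduced here), and the lsmeans calls only run if the package is installed.

```r
# Synthetic stand-in for TVwatch.csv (hypothetical values, NOT the real data)
set.seed(1)
tvfile <- data.frame(
  Age55yr = rep(c(1, 2), each = 10),
  Gender  = rep(c(1, 2), times = 10),
  TVhrs   = round(rnorm(20, mean = 30, sd = 4))
)

# Recode the 1/2 columns as factors so lm() treats them as categories
# instead of as numeric covariates
tvfile$Age55yr <- factor(tvfile$Age55yr)
tvfile$Gender  <- factor(tvfile$Gender)

tv_age_lm <- lm(TVhrs ~ Age55yr + Gender, data = tvfile)

# Run the lsmeans calls only if the package is available
if (requireNamespace("lsmeans", quietly = TRUE)) {
  library(lsmeans)
  print(lsmeans(tv_age_lm, ~ Gender))
  print(lsmeans(tv_age_lm, ~ Age55yr))
}
```

With the real file, the same two factor() lines would go right after the read.csv2() call, assuming the columns are named Age55yr and Gender as in the model formula.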