
I am trying to produce a plot with age on the x-axis, expected serum urate on the y-axis, and one line each for male/white, female/white, male/black and female/black, using the estimates from the lm() function.

library(ggplot2)

goutdata <- read.table("gout.txt", header = TRUE)
goutdata$sex <- factor(goutdata$sex, levels = c("M", "F"))
goutdata$race <- as.factor(goutdata$race)

fm <- lm(su ~ sex + race + age, data = goutdata)
summary(fm)

ggplot(fm, aes(x = age, y = su)) +
  xlim(30, 70) +
  geom_jitter(aes(colour = age)) +
  facet_grid(sex ~ race)

I have tried using facet_wrap() with ggplot to handle the categorical variables, but I want to create just one plot. I was trying a combination of geom_jitter() and geom_smooth(), but I am not sure how to use geom_smooth() with categorical variables. Any help would be appreciated.

Data: https://github.com/gdlc/STT465/blob/master/gout.txt

Do you need to set color based on age? Because the most straightforward way might be creating lines / points / whatever and setting their color / linetype / shape based on sex and race. – camille
I don't need to, no. – Hannah

2 Answers

3
votes

We can use interaction() to create the sex/race groupings on the fly and fit an OLS line for each group right within geom_smooth(). Here they are grouped on one plot:

ggplot(goutdata, aes(age, su, color = interaction(sex, race))) +
  geom_smooth(formula = y~x, method="lm") +
  geom_point() +
  hrbrthemes::theme_ipsum_rc(grid="XY")


And spread out into facets:

ggplot(goutdata, aes(age, su, color = interaction(sex, race))) +
  geom_smooth(formula = y~x, method="lm") +
  geom_point() +
  facet_wrap(sex~race) +
  hrbrthemes::theme_ipsum_rc(grid="XY")


You've now got a partial answer to #1 of https://github.com/gdlc/STT465/blob/master/HW_4_OLS.md :-)
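
One caveat: geom_smooth() fits a separate regression inside each sex/race group, so the lines above are not the estimates from the additive lm() fit in the question. If the lines should come from fm itself, a possible sketch is to predict over a small grid and draw the result with geom_line(). The data below are simulated stand-ins, not the real gout data, and the race codes "W" and "B" are an assumption; swap in the read.table() call from the question to use the actual file.

```r
library(ggplot2)

# Simulated stand-in data so the sketch runs on its own (NOT the real gout data).
# Replace this block with: goutdata <- read.table("gout.txt", header = TRUE)
set.seed(1)
n <- 200
goutdata <- data.frame(
  sex  = factor(sample(c("M", "F"), n, replace = TRUE), levels = c("M", "F")),
  race = factor(sample(c("W", "B"), n, replace = TRUE)),  # assumed coding
  age  = round(runif(n, 30, 70))
)
goutdata$su <- 5 + 0.02 * goutdata$age - 1 * (goutdata$sex == "F") +
  0.5 * (goutdata$race == "B") + rnorm(n)

# Same additive model as in the question
fm <- lm(su ~ sex + race + age, data = goutdata)

# Every sex/race combination at both ends of the age range;
# two points per group are enough because each fitted line is straight.
grid <- expand.grid(
  sex  = levels(goutdata$sex),
  race = levels(goutdata$race),
  age  = c(30, 70)
)
grid$su <- predict(fm, newdata = grid)

# Jittered observations plus the four model-based lines on one panel
p <- ggplot(goutdata, aes(age, su, colour = interaction(sex, race))) +
  geom_jitter(alpha = 0.4) +
  geom_line(data = grid) +
  labs(x = "Age", y = "Expected serum urate", colour = "Sex.Race")
p
```

Because the model has no interaction terms, the four lines drawn this way are parallel and differ only in their intercepts, whereas the geom_smooth() lines can each have their own slope.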

1
votes

You could probably use geom_smooth() to show regression lines?

library(tidyverse)

dat <- read.table("https://raw.githubusercontent.com/gdlc/STT465/master/gout.txt",
                  header = TRUE, stringsAsFactors = FALSE)

dat %>%
  # Readable labels, plus one combined column to colour and facet by
  dplyr::mutate(sex = ifelse(sex == "M", "Male", "Female"),
                race = ifelse(race == "W", "Caucasian", "African-American"),
                group = paste(race, sex, sep = ", ")
                ) %>%
  ggplot(aes(x = age, y = su, colour = group)) +
  # One OLS line per group, without the confidence band
  geom_smooth(method = "lm", se = FALSE, show.legend = FALSE) +
  geom_point(show.legend = FALSE, position = "jitter", alpha = .5, pch = 16) +
  facet_wrap(~group) +
  ggthemes::theme_few() +
  labs(x = "Age", y = "Expected serum urate level")
