1
votes

I am sorry, this is a stupid simple question but I have tried all the solutions I have found online to no avail. This is also my first post here and I have tried to follow the rules with regards to formatting. Ridiculously, I have already achieved exactly what I wanted, saved the plot as a png, then when I returned to the code a few weeks later it was not working, and now I cannot replicate what I had.

I have tried to give some example data here (borrowing some made-up data from this website – I hope that is ok).

tempEf <- data.frame(
  N = rep(c("1", "2","1", "2","1", "2","1"), each=5, times=11),
  Myc = rep(c("1", "2", "3", "4", "5"), each=1, times=77),
  TRTYEAR = runif(385, 1, 15),
  site = rep(c(1:77), each=5, times=1),#77 sites
  Asp = runif(385, 1, 5)
)

# Make up some response data
tempEf$r <- 2*tempEf$TRTYEAR +                   
  -8*as.numeric(tempEf$Myc=="1") +
  4*as.numeric(tempEf$N=="1") +
  0.1*tempEf$TRTYEAR * as.numeric(tempEf$N=="1") +
  0.2*tempEf$TRTYEAR*as.numeric(tempEf$Myc=="1") +
  -11*as.numeric(tempEf$Myc=="1")*as.numeric(tempEf$N=="1")+ 
  0.5*tempEf$TRTYEAR*as.numeric(tempEf$Myc=="1")*as.numeric(tempEf$N=="1")+ 
  as.numeric(tempEf$site) +  #Random intercepts; intercepts will increase by 1
  tempEf$TRTYEAR/10*rnorm(385, mean=0, sd=2)    #Add some noise
#fit model
library(lme4)
model <- lmer(r ~ Myc * N + TRTYEAR + Asp + (1|site), data=tempEf)
tempEf$fit <- predict(model)   #Add model fits to dataframe

I am aiming to:

  1. Calculate fitted values and 95% confidence intervals from the lmer model

  2. Plot the fitted values ("fit") against my dependent variable ("r") separately for the 2 levels of " Myc", coloured according to Myc. I want to ignore N and Asp for the purposes of this plot (in my real data, these are control variables, which are significant in the model but not of interest)

  3. add my 95% confidence intervals to these 2 lines

All this seems simple except it is going very wrong!

I obtain my fitted values and 95% CI's here, which gives me fit, upr and lwr:

predicted_EF<-predictInterval(model, tempEf)

I then add them to my original data frame:

tempEf<-cbind(tempEf,predicted_EF)

Then I do this:

ggplot(tempEf,aes(TRTYEAR, r, group=Myc, col=Myc )) + 
  geom_line(aes(y=fit, lty=Myc), size=0.8) +
  geom_point(alpha = 0.3) + 
  theme_bw()

This gives me jagged lines, as below: crappy graph

I can use geom_smooth instead of geom_line, which gives smooth lines, but I believe this is fitting the lines to the raw data, not the model fit values? I can also fit separate regression lines (using the fit variable) for each level of Myc using geom_abline, but not sure that is right either.

ggplot(tempEf,aes(TRTYEAR, r, group=Myc, col=Myc, fill = Myc)) + 
  geom_smooth(method="lm",se = FALSE)+
  geom_point(alpha = 0.3)+
  theme_bw()

Then trying to add the 95% confidence intervals using my upr and lwr variables results in jagged confidence ribbons:

ggplot(tempEf,aes(TRTYEAR, r, group=Myc, col=Myc, fill = Myc)) + 
  geom_smooth(method="lm",se = FALSE)+
  geom_point(alpha = 0.3) +
  geom_ribbon(alpha=0.1,
              aes(ymin=lwr,ymax=upr,fill = Myc, colour = NA))+
  theme_bw()

How can I get smooth lines with smooth confidence intervals? What am I doing wrong (a lot, I am sure!).

Thank you for your help.

2
In short, it's a complicated question! The jaggy lines are from predict with all the covariates, so each point incorporates them in calculating a prediction and so it's not a smooth line between your x and y variables. My best guess for next steps would be to create a 'dummy' dataset, with evenly spaced time variables, your two grouping variables, and then manually calculate the dependent from your model's coefficients in summary(model)$coefficients - adding half/mean of the categorical variable coefficients. Messy, but accurate.Andrew Baxter
ah thank you! Could I ask, what would be the difference between your approach and doing a linear regression lm(tempEf$fit~tempEf$TRTYEAR) for each level of Myc separately and plotting those lines over the data points in the original graph? I'm struggling to get my head around it, sorry!LauraMai
Have posted an example below to do the plotting each separately, in case that's helpful.Andrew Baxter

2 Answers

2
votes

I think this is a "classical" task for effecs plots (or estimated marginal means). You can do this easily with the ggeffects-package, there are plenty of examples on the website.

Based on your data, you would simply call ggpredict(model, c("TRTYEAR", "Myc")):

library(ggeffects)
pred <- ggpredict(model, c("TRTYEAR", "Myc"))
pred
#> 
#> # Predicted values of r
#> # x = TRTYEAR
#> 
#> # Myc = AM
#>   x predicted std.error conf.low conf.high
#>   0     0.797     0.737   -0.647     2.241
#>   2     5.361     0.727    3.936     6.786
#>   6    14.489     0.716   13.085    15.892
#>   8    19.052     0.715   17.652    20.453
#>  10    23.616     0.716   22.213    25.020
#>  16    37.308     0.737   35.863    38.752
#> 
#> # Myc = ECM
#>   x predicted std.error conf.low conf.high
#>   0    -5.575     0.737   -7.019    -4.130
#>   2    -1.011     0.727   -2.436     0.415
#>   6     8.117     0.716    6.713     9.520
#>   8    12.681     0.715   11.280    14.081
#>  10    17.244     0.716   15.841    18.648
#>  16    30.936     0.737   29.492    32.380
#> 
#> Adjusted for:
#> *    N = Nhigh
#> *  Asp =  2.99
#> * site = 0 (population-level)

plot(pred)
#> Loading required namespace: ggplot2

plot(pred, add.data = TRUE)

Created on 2019-12-11 by the reprex package (v0.3.0)

0
votes

The ggeffects package looks super and is well worth checking out. In response to your question about putting multiple lines for each level of Myc separately the interaction function in a ggplot(aes(group = )) call is always a handy tool to do this quickly. In your case, you included four categorical variables, one of which is coded by colour. To split the other three to give straight lines and ribbons for each (under each subgroup):

ggplot(tempEf, aes(TRTYEAR, r, group = interaction(site, N, Myc), col=Myc, fill = Myc)) +
  geom_point(alpha = 0.3) +
  geom_line(aes(y = fit)) +
  geom_ribbon(aes(ymin = lwr, ymax = upr), alpha = 0.1, colour = NA)

Gives a set of lines and ribbons representing every Myc x site x N grouping. I think from what you're asking it's the other output (from ggeffects) that you want, but in case this is a helpful tool nonetheless:

enter image description here