1
votes

My model does not continue towards the asymptotes when plotted in ggplot2, though it does in R base graphics. In ggplot2, it stops at certain points on the X axis (images included below), I am 90% certain this is related to seq().

I am using predict() to transform data from a drm (dose-response package) logit model. In the base graphics, the sigmoidal curve looks great, in ggplot2, not so much:

library(drc)
library(ggplot2)

fitting data to logit model:

mod1 <- drm(probability ~ (dose), weights = total, data = mydata1,     type ="binomial", fct = LL.2())

plot(mod1,broken=FALSE,type="all",add=FALSE, col= "purple", xlim=c(0, 10000))

Image of Base graphic 2-parameter logit:

Image of Base graphic 2-parameter logit

Extracting the data of the model using code from the author's demonstration (linked below), I have:

newdata1 <-expand.grid(dose=exp(seq(log(0.5),log(100),length=200)))

pm1<- predict(mod1, newdata=newdata1,interval="confidence")

newdata1$p1 <-pm1[,1]

newdata1$pmin1 <-pm1[,2]

newdata1$pmax1 <- pm1[,3]

And finally the ggplot2 graphic:

p1 <- ggplot(mydata1, aes(x=dose01,y=probability))+
  geom_point()+
   geom_ribbon(data=newdata1, aes(x=dose,y=p1,
 ymin=pmin1,ymax=pmax1),alpha=0.2,color="blue",fill="pink") +
   geom_step(data=newdata1, aes(x=dose,y=p1))+
  coord_trans(x="log") +  #creates logline for x axis
  xlab("dose")+ylab("response")

Images 1&2 and 3&4 show differences in my plot depending on seq:

2

When the following is used for seq(), the graphic gets pulled out past the data! (seq(log(0.5),**log(10000)**,length=200))) (images 1&2)

I don't understand seq() despite researching it. What is going on with my plots?

It appears that the first term in seq is defining the lower bound, but then what is the third term defining? You can see this in images 3&4 - The graphic is decent; I've somewhat obfuscated the issue, but it still doesn't continue towards infin. This is a slight issue because I will be co-plotting 8 logit models.

For anyone having trouble plotting drc/drm models with ggplot2, the following posts were very helpful: Search for plotting-dose-response-curves-with-ggplot2-and-drc
and this title: plotting-dose-response-curves-with-ggplot2-and-drc

I've followed the author of DRC's instructions, which can found in the supporting information of his article - Part of this code was used above. Article title: Dose-Response Analysis Using R, Christopher Ritz. PlosOne.

Data:

> dput(mydata1)

structure(list(dose = c(25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 75L, 75L, 75L, 75L, 75L, 75L, 
75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 75L, 100L, 100L, 100L, 
100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
100L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 150L, 
150L, 150L, 150L, 150L, 150L, 200L, 200L, 200L, 200L, 200L, 200L, 
200L, 200L, 200L, 200L, 200L, 200L, 200L, 200L, 200L), total = c(25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 
25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L, 25L), affected = c(2L, 
0L, 0L, 0L, 0L, 1L, 1L, 2L, 2L, 4L, 1L, 1L, 4L, 0L, 10L, 0L, 
1L, 0L, 1L, 0L, 3L, 0L, 4L, 2L, 0L, 2L, 0L, 1L, 2L, 3L, 2L, 0L, 
2L, 0L, 0L, 4L, 0L, 1L, 2L, 3L, 0L, 21L, 1L, 3L, 1L, 2L, 7L, 
0L, 0L, 0L, 0L, 8L, 7L, 3L, 7L, 2L, 2L, 10L, 3L, 4L, 0L, 7L, 
0L, 3L, 3L, 20L, 25L, 22L, 23L, 22L, 18L, 14L, 20L, 20L, 21L), 
    probability = c(0.08, 0, 0, 0, 0, 0.04, 0.04, 0.08, 0.08, 
    0.16, 0.04, 0.04, 0.16, 0, 0.4, 0, 0.04, 0, 0.04, 0, 0.12, 
    0, 0.16, 0.08, 0, 0.08, 0, 0.04, 0.08, 0.12, 0.08, 0, 0.08, 
    0, 0, 0.16, 0, 0.04, 0.08, 0.12, 0, 0.84, 0.04, 0.12, 0.04, 
    0.08, 0.28, 0, 0, 0, 0, 0.32, 0.28, 0.12, 0.28, 0.08, 0.08, 
    0.4, 0.12, 0.16, 0, 0.28, 0, 0.12, 0.12, 0.8, 1, 0.88, 0.92, 
    0.88, 0.72, 0.56, 0.8, 0.8, 0.84)), .Names = c("dose", "total", 
"affected", "probability"), row.names = c(NA, -75L), class = "data.frame")
2

2 Answers

1
votes

You mistook dose values to give predict() or aes(x) values:

log10000 <- exp(seq(log(0.5), log(10000), length=200))
log1000 <- exp(seq(log(0.5), log(1000), length=200))

log10000df <- as.data.frame(cbind(dose = log10000, predict(mod1, data.frame(dose = log10000), interval="confidence")))
log1000df <- as.data.frame(cbind(dose = log1000, predict(mod1, data.frame(dose = log1000), interval="confidence")))

 ## a common part
p0 <- ggplot(mydata1, aes(x = dose, y = probability)) +
  geom_point() + coord_trans(x="log") + 
  xlab("dose") + ylab("response") + xlim(0.5, 10001)

p10000 <- p0 + geom_line(data = log10000df, aes(x = dose, y = Prediction)) +
  geom_ribbon(data = log10000df, aes(x = dose, y = Prediction, ymin = Lower, ymax = Upper),
              alpha = 0.2, color = "blue", fill = "pink")
  
p1000 <- p0 + geom_line(data = log1000df, aes(x = dose, y = Prediction)) +
  geom_ribbon(data = log1000df, aes(x = dose, y = Prediction, ymin = Lower, ymax = Upper),
              alpha = 0.2, color = "blue", fill = "pink")

enter image description here enter image description here

0
votes

See ?seq for an explanation of what the terms of seq do. seq(log(.5), log(1000), length=200) makes 200 numbers evenly spaced starting from log(0.5) going to log(1000). If you don't name the third argument (or specify by=XYZ), it's the spacing between the numbers.

So when you do seq(log(0.5), log(1000), length=200) it will calculate your fit at 200 points from log(0.5) to log(1000).

I think by "go out to infinity" you mean that you want the line to go off the edge of the plot rather than stopping before the edge as in your linked pictures. By default ggplot will attempt to ensure all your plotted things fit on your plot, so it will extend the axes a little beyond the extent of your data.

If you want to restrict it, just use + xlim(c(lower, upper)).

(I am having troubles installing drc so cannot reproduce your example; here's a toy one)

x = seq(0.5, 100, length=200)
df <- data.frame(x=x, y=x^2)
ggplot(df, aes(x=x, y=y)) + geom_line() + coord_trans(x="log")

original plot

In the above, the line goes out to 100 (as expected) and the limits extend a little beyond. If I wanted the line to touch the edge of the plot, then I can (e.g.) clip the x limits at 100 exactly - use the limx argument to coord_trans:

ggplot(df, aes(x=x, y=y)) + geom_line() + coord_trans(x="log", limx=c(0.5, 100))

clipped

So when you plot your models, decide what the bounds of your x axis will be and make sure you predict all of your models along those values. Then restrict the x limit to those bounds, and the lines will appear to "go to infinity".