
I've plotted a Kaplan-Meier curve using ggsurvplot on a survfit object and I am trying to plot another curve on top of it. The additional line is a vector of mean survival probabilities. My KM curve is estimated from IPD with 225 observations and a maximum survival time of 7.2 years; the mean survival values come from a Bayesian analysis, with samples drawn from the posterior survival function at 6-month intervals. The problem seems to be that the vector I'm trying to plot is not the same length as my IPD data frame. One workaround would be to divide the last survival time by the number of observations in my IPD and resample from the posterior distribution at those intervals, but I'm wondering whether there is any way to avoid that.

My data looks like this:

## IPD data

   treatment      t event
2          1 5.5250     1
3          1 1.9493     1
4          1 4.9473     1
7          1 5.9466     0
11         1 1.5797     1
12         1 0.5038     1
.              .        .
.              .        . 
.              .        .

## mean survival times from the Bayesian analysis

1.0000000 0.9129731 0.8337045 0.7614860 0.6956758 0.6356917 .....

The code I'm trying:

f <- survfit(Surv(t, event)~1, data=treatment)
f1 <- ggsurvplot(f) 
f2 <- f1$plot + geom_line(aes(c(0:16), meansurv))

Doing this, I get the following error:

<error/rlang_error>
Aesthetics must be either length 1 or the same as the data (225): x and y

Additional question: I'm not sure how to produce a plot with only one legend. The following code produces two, and if I use legend="none", both are removed.

meansurv1 <- c(1.0000000, 0.9129731, 0.8337045, 0.7614860, 0.6956758, 0.6356917, 0.50, 0.43, 0.37)
meansurv2 <- c(1.0000000, 0.9324888, 0.8671987, 0.8042297, 0.7436717, 0.6856045, 0.6300962, 0.5772029, 0.5269681)
x<- c(0:8)
temp <- data.frame(x, meansurv1, meansurv2)

temp<- temp %>% 
  gather(key, value, -c(x))


f1 <- survfit(Surv(t, event)~1, data=control)
f1 <- ggsurvplot(f1, legend="right") 
f1 <- f1$plot +  geom_line(data = temp, aes(x=x, y=value, group=key, color=key));f1

1 Answer


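The problem is in the use of geom_line. Since you haven't specified the data argument inside geom_line, it is NULL. The help page of geom_line states that when data = NULL, the layer inherits its data from the plot. Here the plot data is the data frame that ggsurvplot built from the survfit object, so ggplot2 expects x and y to have as many entries as that data frame has rows (225 in your case), and a vector of length 17 fails.

You can specify the data argument inside geom_line to get rid of the problem: pass a data frame that holds your time points and mean survival values, and the layer no longer depends on the length of the plot data.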

Below is a reproducible example. I took the numbers given in your question, but I've adjusted the vector meansurv so that it has more entries than the data frame treatment has rows.

First I reproduce your error, and then I show how to do it correctly using the data argument of geom_line().

library(survival)
library(survminer)
#> Loading required package: ggplot2
#> Loading required package: ggpubr

# a preview of the data frame. 
treatment <- data.frame(
  treatment = c(1, 1, 1, 1, 1, 1), 
  t = c(5.525, 1.9493, 4.9473, 5.9466, 1.5797, 0.5038), 
  event = c(1, 1, 1, 0, 1, 1)
  )

# modified vector: 
meansurv <- c(1.0000000, 0.9129731, 0.8337045, 0.7614860, 0.6956758, 0.6356917, 0.50, 0.43, 0.37)

f <- survfit(Surv(t, event)~1, data=treatment)
f1 <- ggsurvplot(f); f1


# no data argument in geom_line() -> the layer inherits the plot data and raises an error:
f2 <- f1$plot + geom_line(aes(seq.int(meansurv) - 1, meansurv)); f2
#> Error: Aesthetics must be either length 1 or the same as the data (7): x and y


# using argument data: 
f2 <- f1$plot + geom_line(data = data.frame(x = seq.int(meansurv) - 1, 
                                            y = meansurv), 
                          aes(x = x, y = y)); f2

Created on 2020-07-21 by the reprex package (v0.3.0)
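
One more point may be worth checking. The question says the posterior samples were drawn at 6-month intervals while the survival times are in years, so an x axis of 0, 1, 2, ... would stretch the Bayesian curve to twice its intended length. The sketch below assumes the first value corresponds to time 0 and that consecutive values are 0.5 years apart; the names post, time and surv are only illustrative, and the short meansurv vector is the one printed in the question.

```r
library(survival)
library(survminer)

# small stand-in for the IPD, using the rows shown in the question
treatment <- data.frame(
  t = c(5.525, 1.9493, 4.9473, 5.9466, 1.5797, 0.5038),
  event = c(1, 1, 1, 0, 1, 1)
)

# posterior mean survival values as printed in the question
meansurv <- c(1.0000000, 0.9129731, 0.8337045, 0.7614860, 0.6956758, 0.6356917)

# ASSUMPTION: values start at time 0 and are spaced 0.5 years apart
post <- data.frame(
  time = seq(0, by = 0.5, length.out = length(meansurv)),
  surv = meansurv
)

f <- survfit(Surv(t, event) ~ 1, data = treatment)

# the layer gets its own data frame, so its length is independent of the KM data;
# inherit.aes = FALSE keeps it from picking up any aesthetics set by ggsurvplot
p <- ggsurvplot(f)$plot +
  geom_line(data = post, aes(x = time, y = surv), inherit.aes = FALSE)
p
```

If the posterior grid is actually on a different scale, only the by argument of seq() needs to change.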