5
votes

What I have is a Kaplan-Meier Analysis of patients with mechanical heart support using R.

What I need is adding the following data into the plot (like in the example):

  • patients who survived due to a heart transplantation (HTX)
  • patients who died

In other words, there are two groups where one is a subset (transplanted patients) of the other (all patients). These two curves must start at 0/0 and will increase.

My own plot is done by:

pump <- read.table(file=datafile, header=FALSE,
                   col.names=c('TIME', 'CENSUS', 'DEVICE'))
# convert days to months
pump$TIME <- pump$TIME/(730/24)
mfit.overall <- survfit(Surv(TIME, CENSUS==0) ~ 1, data=pump)
plot(mfit.overall, xlab="months on device", ylab="cum. survival", xaxt="n")
axis(1, at=seq(from=0, to=24, by=6), las=0)

How may I add the two additional curves?

Kind Regards Johann

Sample Kaplan Meier Curve: http://i.stack.imgur.com/158e8.jpg

Demo Data:

Survival Data, which goes into pump:

TIME    CENSUS  DEVICE
426     1       1
349     1       1
558     1       1
402     1       1
12      0       1
84      0       1
308     1       1
174     1       1
315     1       1
734     1       1
544     1       2
1433    1       2
1422    1       2
262     1       2
318     1       2
288     1       2
1000    1       2

TX data:

TIME    CENSUS  DEVICE
426     1        1
288     1        2
308     1        1

deaths:

TIME    CENSUS  DEVICE
12      0        1
84      0        1
2
Can you provide the data you use for the plot or dummy data on the same form, in case you cannot share the actual data?Backlin
To be honest, even though par(new=TRUE) is a useful hack, I think @Gavin has the better answer. If I had known about it I would also have solved it his way.Backlin

2 Answers

7
votes

With par(new=TRUE) you can draw a second plot in the same figure as the first.

Normally I would recommend using lines() for adding curves to a plot, since par(new=TRUE) performs the filosophically different task of overlaying plots. When using functions in a way they are not intended to be used you risk commiting misstakes, e.g. I nearly forgot the vital xlim argument. However, it is not trivial to extract the curves from the survfit objects, so I figured it was the lesser of two evils.

# Fake data for the plots
pump <- data.frame(TIME=rweibull(40, 2, 20),
                   CENSUS=runif(40) < .3,
                   DEVICE=rep(0:1, c(20,20)))
# load package
library("survival")

# Fit models
mfit.overall <-survfit(Surv(TIME, CENSUS==0) ~ 1, data=pump)
mfit.htx <- survfit(Surv(TIME, CENSUS==0) ~ 1, data=pump, subset=DEVICE==1)

# Plot
plot(mfit.overall, col=1, xlim=range(pump$TIME), fun=function(x) 1-x)
# `xlim` makes sure the x-axis is the same in both plots
# `fun` flips the curve to start at 0 and increase
par(new=TRUE)
plot(mfit.htx, col=2, xlim=range(pump$TIME), fun=function(x) 1-x,
    ann=FALSE, axes=FALSE, bty="n") # This hides the annotations of the 2nd plot
legend("topright", c("All", "HTX"), col=1:2, lwd=1)

enter image description here

7
votes

No need for plot.new() (though this is a nice illustartion of that paradigm). This can all be achieved via the lines() method for class "surfit".

plot(mfit.overall, col=1, xlim=range(pump$TIME), fun=function(x) 1-x)
lines(mfit.htx, col=2, fun=function(x) 1-x)
lines(mfit.htx, col=2, fun=function(x) 1-x, lty = "dashed", conf.int = "only")
legend("topleft", c("All", "HTX"), col=1:2, lwd=1, bty = "n")

This gives, using @Backlin's example data (but different seeds, hence different data)

enter image description here

The reason for the two calls to lines() is to arrange for the confidence interval to be drawn with a dashed line and I couldn't see a way to pass multiple lty's to lines() (that worked!) in the single lines() call.