0
votes

I am trying to create a Weibull probability plot for censored data in R. To do so I need a log-log scale for the y axis and a log scale for the x axis.

The y axis is the probability (range 0 to 1) and the x axis is "time in days".

I know that I can create logarithmic axis with log="xy" in the plot() function. But I need a log-log scale for the y axis.

Is there a way to do that?

Thank you in advance!

Data example: data$

              X cens survCount   medianRank
       136.5424    1        10   0.09090909
       181.9756    1         9   0.18181818
       192.4309    1         8   0.27272727
       216.6145    1         7   0.36363636
       224.3097    0         6   NA
       254.4997    0         5   NA
       285.1438    1         4   0.49090909
       289.3991    1         3   0.61818182
       295.9161    0         2   NA
       309.9522    0         1   NA

X: times till failure
cens: binary, 0 if censored
survCount: number of things alive before failure/censoring
medianRanks: cumulated probability of failure

Explanation:
So X is what I want on the log x axis and the medianRanks are what I want on the log-log y axis.

The problem is that you can't calculate twice the logarithm from a number <1 becaue the first logarithm will give a negative number and you can't calculate a logarithm from a negative number. That is why I want to transform the axis and not the values.

What I did so far:
My workaround so far is to multiply my y values with a number (e.g. 1000) so that I don't have any values that are less than 1 and then calculate the log-log of these values. I then plot them, hide the axes and add new axes with the axis() function.

data$medianRank <- data$medianRank*1000  
loglogY <- log(log(data$medianRank))  
logX <- log(data$X)  

plot(logX, loglogY, yaxt="n", xaxt="n")  

ylabels <- c(0.1, 0.2, 0.5, 0.7, 0.99)
yAt <- log(log(ylabels*1000))
axis(2, at=yAt, labels=ylabels)
xlabels <- c(100, 200, 300, 400)
xAt <- log(xlabels)
axis(1, at=xAt, labels=xlabels)

@mike1886 suggested to use the ggplot2 package. I had a look at it and what I found is quite promising. When one creates a ggplot one can add coord_trans() to transform the axes. There are a few transformations available but I couldn't find a log-log one. Fortunately one can also write a custom transformation with the trans_new() function from the scales package.
My code so far for the new transformation:

require(ggplot2)
require(scales)
loglog_trans <- function(){
  trans <- function(x){ log(log(x)) }
  inv <- function(x){ exp(exp(x)) }
  trans_new("loglog", trans, inv)
}

wpp <- ggplot(data2, aes(ftime, medianRank)) + geom_point()
wpp
wpp + coord_trans("log10", "loglog")

But it is not working.

Error in if (zero_range(range)) { : missing value where TRUE/FALSE needed In addition: Warning message: In log(log(x)) : NaNs produced

2
You should present some data and preliminary work. SO is not a do-my-project-for-me website.IRTFM
Added example data, explanation and what I did so far. Can't post the resulting plot because my reputation is not high enough.elevendollar

2 Answers

1
votes

You can try using ggplot2 (this is a very nice and complete plotting package) in R. For example, consider looking at the page: http://www.cookbook-r.com/Graphs/Axes_(ggplot2)/#axis-transformations-log-sqrt-etc

This will allow you to do whatever you would like to the axes. For example,

m <- qplot(rating, log10(votes), data=subset(movies, votes > 1000), na.rm = TRUE)
m + scale_y_log10() + scale_x_log10()
1
votes

I suspect that you are being expected to plot "complementary log -log" which probably means you are being asked to plot the log of the negative log. I admit that this is not exactly how such plots usually appear. What I usually see in texts regarding survival analysis is a rising trend and one should see roughly parallel lines (with positive slope) for log(-log(survival)) plotted against time when the proptional hazards assumption is met.

 dat <- read.table(text="             X cens survCoun
        136.5424    1        10   0.09090909
        181.9756    1         9   0.18181818
        192.4309    1         8   0.27272727
        216.6145    1         7   0.36363636
        224.3097    0         6   NA
        254.4997    0         5   NA
        285.1438    1         4   0.49090909
        289.3991    1         3   0.61818182
        295.9161    0         2   NA
        309.9522    0         1   NA", header=TRUE)

 with( dat, plot( log(X), log( - log(medianRank) ) ) )

enter image description here

So consider this where I am taking survCount/10 to be the proportion surviving at time= X:

png(); with( dat, plot( log(X), 
                        log( - log(survCount/max(survCount) ) ) 
                       ) )
dev.off()

enter image description here