I want to write a function that calculates hazard ratios with coxph for a dataset. The data has 5 variables: two of them are used in Surv(), and the other three are used as covariates. I can already write a function that takes a data frame and calculates the hazard ratios when there are two covariates. However, when I use the same function with 3 covariates, R reports "run out of iterations and did not converge or more coefficients maybe infinite", and the result contains all five variables as covariates (there should be three). Here is my code. Can anyone correct it? Thanks!

library(KMsurv)
library(survival)
data(larynx)
larynx2 = larynx[,c(2,5,1,3,4)]
larynx2$stage = as.factor(larynx2$stage)
mod = function(dataname){
    fit = coxph(Surv(dataname[,1],dataname[,2]) ~ ., data = dataname, ties = "breslow")
    return(list(result = summary(fit)))
}
mod(larynx2)
(a) You're missing a comma after your formula. (b) Does it work if you use the actual column names inside Surv() instead of dataname[, 1] and dataname[, 2]? There could be something weird happening since you are using . in the formula, using the data frame in the formula, and using the data argument. - Gregor Thomas
Thanks for your reply! I'm sorry that I missed a comma in my formula, but that is not the point. Using the actual column names works, but I want a function that works when the dataset changes: once you pass in a data frame, it should automatically show the hazard ratio result. So I cannot hard-code the column names in the function. - JohnSun

1 Answer

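How about this? The `.` on the right-hand side expands to every column of `data` whose name does not appear on the left-hand side. Because `Surv(dataname[,1], dataname[,2])` never mentions the time and event columns by name, they are added as covariates too, which explains both the five covariates and the convergence warning. Since column names in the formula work, we build the formula dynamically using the column names: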

mod = function(dataname) {
    form = as.formula(sprintf("Surv(%s, %s) ~ .", names(dataname)[1], names(dataname)[2]))
    fit = coxph(form, data = dataname, ties = "breslow")
    return(list(result = summary(fit)))
}

mod(larynx2)
# $result
# Call:
# coxph(formula = form, data = dataname, ties = "breslow")
# 
#   n= 90, number of events= 50 
# 
#            coef exp(coef) se(coef)      z Pr(>|z|)    
# stage2  0.15078   1.16275  0.46459  0.325   0.7455    
# stage3  0.64090   1.89820  0.35616  1.799   0.0719 .  
# stage4  1.72100   5.59012  0.43660  3.942 8.09e-05 ***
# age     0.01855   1.01872  0.01432  1.295   0.1954    
# diagyr -0.01923   0.98096  0.07655 -0.251   0.8017    
# ---
# Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# 
#        exp(coef) exp(-coef) lower .95 upper .95
# stage2     1.163     0.8600    0.4678     2.890
# stage3     1.898     0.5268    0.9444     3.815
# stage4     5.590     0.1789    2.3757    13.154
# age        1.019     0.9816    0.9905     1.048
# diagyr     0.981     1.0194    0.8443     1.140
# 
# Concordance= 0.676  (se = 0.039 )
# Rsquare= 0.182   (max possible= 0.988 )
# Likelihood ratio test= 18.13  on 5 df,   p=0.003
# Wald test            = 20.87  on 5 df,   p=9e-04
# Score (logrank) test = 24.4  on 5 df,   p=2e-04
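
As a variation on the same idea, the formula can also be assembled without pasting strings, using `reformulate()` from base R. The sketch below is only an illustration: the function name `hr_table` and the `lung_small` data frame (built from the `lung` data that ships with survival) are made up for this example, and it assumes, as in the question, that the first two columns are the time and event indicator and that every remaining column is a covariate.

```r
library(survival)

# Hypothetical helper: assumes column 1 is the follow-up time, column 2 is the
# event indicator, and all remaining columns are covariates.
hr_table = function(dataname) {
    time_col  = names(dataname)[1]
    event_col = names(dataname)[2]
    covars    = names(dataname)[-(1:2)]

    # Build Surv(<time>, <event>) as a call so the columns are referenced by name
    response = call("Surv", as.name(time_col), as.name(event_col))

    # Backticks keep covariate names with spaces or symbols parseable
    form = reformulate(sprintf("`%s`", covars), response = response)

    fit = coxph(form, data = dataname, ties = "breslow")

    # Hazard ratios with 95% confidence limits, on the exp(coef) scale
    exp(cbind(HR = coef(fit), confint(fit)))
}

# Usage with a dataset bundled with survival (time and status come first)
lung_small = lung[, c("time", "status", "age", "sex")]
hr = hr_table(lung_small)
print(hr)
```

Listing the covariates explicitly instead of relying on `.` makes it harder for the time or event column to slip into the right-hand side by accident.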