3
votes

The XGBoost docs imply that the output of a model trained with the Cox PH loss is the exponentiated prediction for each individual, i.e. that person's multiplier against the baseline hazard. Is there no way to extract the baseline hazard from this model in order to predict the entire survival curve per person?

survival:cox: Cox regression for right censored survival time data (negative values are considered right censored). Note that predictions are returned on the hazard ratio scale (i.e., as HR = exp(marginal_prediction) in the proportional hazard function h(t) = h0(t) * HR)


1 Answer

3
votes

No, I don't think so. A workaround is to fit the baseline hazard in another package, e.g. with from sksurv.linear_model import CoxPHSurvivalAnalysis in Python or with require(survival) in R. You can then use the predicted output from XGBoost as multipliers on the fitted baseline. Just remember that if the baseline is on the log scale, you should predict with output_margin=True and add the predictions instead of multiplying.
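If you would rather not fit a second model, one possible alternative is to estimate the cumulative baseline hazard yourself with the Breslow estimator, using the XGBoost margins as the log hazard ratios. The following is only a minimal sketch under that assumption, not something XGBoost provides: the helper names breslow_baseline and survival_curves are made up for illustration, and the log_hr array is a stand-in for what booster.predict(dtrain, output_margin=True) would return on the training data.

```python
import numpy as np

def breslow_baseline(times, events, log_hr):
    """Breslow estimate of the cumulative baseline hazard H0(t).

    times  : observed follow-up times (all positive)
    events : 1/True if the event was observed, 0/False if right censored
    log_hr : predictions on the log scale (XGBoost margins)
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    risk = np.exp(np.asarray(log_hr, dtype=float))

    event_times = np.unique(times[events])
    # Sum of exp(log_hr) over everyone still at risk at each event time.
    at_risk = np.array([risk[times >= t].sum() for t in event_times])
    # Number of events observed at each distinct event time.
    deaths = np.array([(events & (times == t)).sum() for t in event_times])
    return event_times, np.cumsum(deaths / at_risk)

def survival_curves(cum_hazard, log_hr):
    """S(t | x) = exp(-H0(t) * exp(log_hr)); one row per person."""
    hr = np.exp(np.asarray(log_hr, dtype=float))
    return np.exp(-np.outer(hr, cum_hazard))

# Toy data in the survival:cox label convention:
# negative values are right censored.
label = np.array([2.0, -3.0, 5.0, 7.0])
times, events = np.abs(label), label > 0

# Stand-in for booster.predict(dtrain, output_margin=True).
log_hr = np.array([0.1, -0.2, 0.4, 0.0])

event_times, H0 = breslow_baseline(times, events, log_hr)
S = survival_curves(H0, log_hr)  # shape: (n_people, n_event_times)
```

Any constant offset in the margins is absorbed into the estimated baseline, so this should stay consistent as long as the same kind of prediction (margins, not the exponentiated output) is used both to estimate H0 and to build the curves for new people.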

I hope the authors of XGBoost will soon provide some examples of how to use this objective.