Survival analysis: Cox proportional hazards model

TODO

  • link to survivalKM, survivalParametric

Install required packages

survival

Simulated right-censored event times with Weibull distribution

Time independent covariates with proportional hazards

\(T = (-\ln(U) \, b \, e^{-\bf{X} \bf{\beta}})^{\frac{1}{a}}\), where \(U \sim \mathcal{U}(0, 1)\)

Survival data in counting process (start-stop) notation.

Fit the Cox proportional hazards model

Assumptions

\[ \begin{equation*} \begin{array}{rclcl} \ln \lambda(t) &=& \ln \lambda_{0}(t) + \beta_{1} X_{1} + \dots + \beta_{p} X_{p} &=& \ln \lambda_{0}(t) + \bf{X} \bf{\beta}\\ \lambda(t) &=& \lambda_{0}(t) \, e^{\beta_{1} X_{1} + \dots + \beta_{p} X_{p}} &=& \lambda_{0}(t) \, e^{\bf{X} \bf{\beta}}\\ S(t) &=& S_{0}(t)^{\exp(\bf{X} \bf{\beta})} &=& \exp\left(-\Lambda_{0}(t) \, e^{\bf{X} \bf{\beta}}\right)\\ \Lambda(t) &=& \Lambda_{0}(t) \, e^{\bf{X} \bf{\beta}} & & \end{array} \end{equation*} \]

Call:
coxph(formula = Surv(obsT, status) ~ X + IV, data = dfSurv)

        coef exp(coef) se(coef)      z        p
X    0.49304   1.63728  0.09374  5.260 1.44e-07
IVB -0.82245   0.43935  0.21081 -3.901 9.56e-05
IVC  0.37670   1.45746  0.19336  1.948   0.0514

Likelihood ratio test=51.62  on 3 df, p=3.617e-11
n= 180, number of events= 157 

Use counting process data format.

Assess model fit

AIC

[1]    3.000 1367.667

Model comparisons

Restricted model without factor IV

Analysis of Deviance Table
 Cox model: response is  Surv(obsT, status)
 Model 1: ~ X
 Model 2: ~ X + IV
   loglik  Chisq Df P(>|Chi|)    
1 -698.77                        
2 -680.83 35.867  2 1.628e-08 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Estimate survival-function und cumulative baseline hazard

Survival function

For average pseudo-observation with covariate values equal to the sample means. For factors: means of dummy-coded indicator variables.

Call: survfit(formula = fitCPH)

      n  events  median 0.95LCL 0.95UCL 
    180     157      14      11      19 
25 50 75 
 4 14 39 

More meaningful: Estimated survival function for new specific data

plot of chunk rerSurvivalCoxPH01
plot of chunk rerSurvivalCoxPH01

Cumulative hazard

  time  cum_haz.1  cum_haz.2
1    1 0.02700124 0.03935325
2    2 0.05287319 0.07706061
3    3 0.09865382 0.14378411
4    4 0.12645421 0.18430209
plot of chunk rerSurvivalCoxPH02
plot of chunk rerSurvivalCoxPH02

Model diagnostics

Proportional hazards assumption

Complementary log-log plot for categorized predictor X.

plot of chunk rerSurvivalCoxPH03
plot of chunk rerSurvivalCoxPH03
plot of chunk rerSurvivalCoxPH03
plot of chunk rerSurvivalCoxPH03

Test based on scaled Schoenfeld-residuals

       chisq df    p
X      0.929  1 0.34
IV     3.062  2 0.22
GLOBAL 5.037  3 0.17

Plot scaled Schoenfeld-residuals against covariates

plot of chunk rerSurvivalCoxPH04
plot of chunk rerSurvivalCoxPH04

Influential observations

plot of chunk rerSurvivalCoxPH05
plot of chunk rerSurvivalCoxPH05

Linearity of log hazard

Plot martingale-residuals against continuous covariate with a nonparametric LOESS-smoother

plot of chunk rerSurvivalCoxPH06
plot of chunk rerSurvivalCoxPH06

Predicted hazard ratios

Hazard ratio relative to an average pseudo-observation with covariate values equal to the sample means. For factors: means of dummy-coded indicator variables.

 [1] 1.9166403 1.5389801 1.3210722 0.8617067 2.2969713 0.8735328 3.4527637
 [8] 2.5002058 1.0455378 0.7079919

Estimated global survival function.

  time      surv
1    1 0.9269729
2    2 0.8643413
3    3 0.7681378
4    4 0.7173365

Estimated survival function per group.

  time      IV.A      IV.B      IV.C
1    1 0.9265656 0.9633824 0.8909708
2    2 0.8626937 0.9299972 0.8003328
3    3 0.7630677 0.8746553 0.6666905
4    4 0.7097748 0.8431836 0.5990511

Further resources

Detach (automatically) loaded packages (if possible)

Get the article source from GitHub

R markdown - markdown - R code - all posts