July 30, 2025
The authors present a new method for assessing predictive performance that works for non-parametric and semi-parametric survival models and can accommodate the addition of a frailty term as well as smoothing. The predictive log-likelihood idea does not work well for semi-parametric, non-parametric, and additive models whose estimates are step functions: inference is made only at the observed event times, so predicting at a time outside of those event times can lead to likelihood values of zero.
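To make the zero-likelihood problem concrete, the standard log-likelihood contribution of a right-censored observation can be written as follows. The notation is an assumption for illustration and is not taken from the paper: t_i is the observed time, \delta_i the event indicator, h the hazard, and H the cumulative hazard.

```latex
\ell_i \;=\; \delta_i \,\log h(t_i \mid x_i) \;-\; H(t_i \mid x_i)
```

If the estimated cumulative hazard is a step function that jumps only at the training event times, its implied hazard is zero everywhere else. A new event at a time that is not one of those jump points then has an estimated hazard of zero, so its likelihood contribution is zero and its log-likelihood is undefined.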
This problem is well known, and several solutions are used in practice. The solution of Verweij and van Houwelingen (1993) stays closest to the idea of predictive likelihood by replacing the predictive likelihood with a predictive partial likelihood. However, this solution is only suitable for Cox models, not for general semi- and nonparametric contexts, which do not have an associated partial likelihood. Other approaches move away from likelihoods altogether.
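For reference, the cross-validated partial log-likelihood of Verweij and van Houwelingen (1993) is usually stated in the following form. The symbols are chosen here for illustration: \ell is the log partial likelihood on all n observations, \ell_{(-i)} is the log partial likelihood with observation i left out, and \hat\beta_{(-i)} is the estimate obtained without observation i.

```latex
\mathrm{CVL} \;=\; \sum_{i=1}^{n} \Big[ \ell\big(\hat\beta_{(-i)}\big) \;-\; \ell_{(-i)}\big(\hat\beta_{(-i)}\big) \Big]
```

Each term measures how much observation i adds to the partial likelihood when the coefficients were estimated without it, which is why the construction depends on a partial likelihood being available.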
Verweij and van Houwelingen (1993) developed a likelihood-based loss function that handles the problem of predicting from step functions, and combined it with cross-validation to define a cross-validated log-likelihood. However, their method relies on partial likelihoods, which do not exist for non-parametric models. To handle this problem, the authors propose estimating the hazard by smoothing the estimated cumulative hazard. A challenge of this approach is the choice of the bandwidth. They did not consider fixed bandwidths, since those would not make sense in this context, and instead proposed an adaptive bandwidth based on a nearest-neighbor strategy: the bandwidth covers a fixed number m of event times on each side of the time point at which the hazard is estimated, and they recommend m=5 based on practical experience. They state that the proportional hazards assumption is naturally satisfied by their new method, so only the baseline cumulative hazard needs to be smoothed. The estimated cumulative baseline hazard Ĥ is a right-continuous step function with finite jumps at the observed event times, the jumps being given by their definition of the baseline hazard rate. Applying the smoothing idea to the baseline hazard yields the estimated hazard and the smoothed estimated hazard.
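The nearest-neighbor idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the cumulative hazard is a plain Nelson-Aalen estimate without covariates, and the smoother is a simple uniform-window slope (the increase of the step function across a window reaching m event times on each side, divided by the window width). The kernel and boundary handling used in the paper may differ.

```python
import numpy as np


def nelson_aalen(times, events):
    """Distinct event times and the Nelson-Aalen cumulative hazard at them."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    # Number still at risk at each event time, and number of events there.
    at_risk = np.array([(times >= t).sum() for t in event_times])
    deaths = np.array([(times[events] == t).sum() for t in event_times])
    return event_times, np.cumsum(deaths / at_risk)


def smoothed_hazard(t, event_times, cum_hazard, m=5):
    """Hazard at t as the slope of the step function over an adaptive window.

    Assumption: the window runs from the m-th event time at or before t
    to the m-th event time after t, clipped at the ends of the data.
    """
    k = len(event_times)
    j = np.searchsorted(event_times, t, side="right")  # event times <= t
    lo = max(j - m, 0)
    hi = min(j + m - 1, k - 1)
    if hi <= lo:
        raise ValueError("window contains too few event times")
    width = event_times[hi] - event_times[lo]
    return (cum_hazard[hi] - cum_hazard[lo]) / width


def predictive_loglik(test_times, test_events, event_times, cum_hazard, m=5):
    """Sum of delta * log(h(t)) - H(t) over held-out observations.

    Assumption: H(t) is read off the unsmoothed step function, while
    h(t) comes from the smoothed estimate so that it is never zero.
    """
    total = 0.0
    for t, d in zip(test_times, test_events):
        j = np.searchsorted(event_times, t, side="right")
        total -= cum_hazard[j - 1] if j > 0 else 0.0
        if d:
            total += np.log(smoothed_hazard(t, event_times, cum_hazard, m))
    return total
```

Calling `nelson_aalen` on training data and passing its output to `predictive_loglik` with held-out times and event indicators gives a finite score even when a held-out event time does not coincide with any training event time, which is the situation where the unsmoothed step function breaks down.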
Their new smoothing method fits easily into different types of semi- and nonparametric survival models, which makes it flexible and widely useful. It also does not have the information leak that the method of Verweij and van Houwelingen (1993) has. As a novel application, they used a penalized additive hazards model, with a penalty term that shrinks the time-dependent regression coefficients. They tested their method in two simulations. In the first, they performed variable selection in the Cox model and found that their method matched the results of the original, likelihood-based methods and did better than the Brier score (BS) and the integrated Brier score (IBS). In the second, they selected between a frailty model and the Cox model while varying the frailty variance. Their method was more likely to select the frailty model as the variance increased, whereas the IBS and BS showed more erratic model choice, even selecting the frailty model when the frailty variance was low. They also validated their method in a real-world data application.
Written by,
Usha Govindarajulu
Keywords: survival, smoothed predictive likelihood
References:
Lu C, Putter H, Girondo MR, and Goeman JJ (2025). “Model Validation for Survival Analysis by Smoothed Predictive Likelihood” Statistics in Medicine. https://doi.org/10.1002/sim.70193
Verweij PJM and van Houwelingen HC (1993). “Cross-Validation in Survival Analysis” Statistics in Medicine 12(24): 2305–2314.