Bridging Survival Analysis and Machine Learning to Improve Healthy Life Expectancy Estimation using PHR Records
Published in npj Digital Medicine, 2026

Healthy Life Expectancy (HLE) measures the years an individual lives free of disease. Minimising the time a population spends in ill health directly impacts national healthcare budgets as well as individuals’ quality of life. We use Personal Health Records (PHR), which combine Electronic Health Records with lifestyle and wellness data, to investigate loss of healthy life. We apply Survival Analysis (SA) to estimate HLE directly, without the Sullivan Method. Additionally, we train a multiple imputation ensemble Machine Learning (ML) model to predict the loss of healthy life within one year, achieving an AUPRC almost double that of a random classifier on the unbalanced data. ML explainability techniques enable investigation of the model’s learned relationships, providing insights into the effects of lifestyle on maintaining healthy life. Finally, we propose a novel method of combining the ML model’s prediction with the hazards estimated by SA, enabling the formulation of a tailored, conditioned Survival Function.
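
For context on the AUPRC comparison, a random classifier on unbalanced data scores an AUPRC roughly equal to the positive-class prevalence, so "double that of random" is relative to that prevalence. The following is a minimal sketch of this baseline on synthetic data. The 5% prevalence, the sample size, and the function names `average_precision` and `random_baseline` are illustrative assumptions, not values or code from the paper.

```python
import random


def average_precision(labels, scores):
    """AUPRC computed as the average-precision sum over ranked predictions."""
    # Rank examples from highest to lowest score.
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    total_pos = sum(labels)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank  # precision at each positive's rank
    return ap / total_pos


def random_baseline(prevalence=0.05, n=20000, seed=0):
    """Return (observed prevalence, AUPRC of uninformative random scores)."""
    # prevalence and n are assumed values chosen only for illustration.
    rng = random.Random(seed)
    labels = [1 if rng.random() < prevalence else 0 for _ in range(n)]
    scores = [rng.random() for _ in range(n)]  # scores carry no signal
    return sum(labels) / n, average_precision(labels, scores)


if __name__ == "__main__":
    observed_prevalence, auprc = random_baseline()
    print(f"prevalence={observed_prevalence:.3f}  random AUPRC={auprc:.3f}")
```

Under these assumptions, a model with an AUPRC near twice the prevalence would match the kind of improvement described above. The absolute value still depends on how rare the loss-of-healthy-life event is in the cohort.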
