Shuo Chen, PhD

September 11, 2025 Webinar

Unbiased machine learning regression models for biological age

Abstract: 

Machine learning models for continuous outcomes often yield systematically biased predictions, particularly for values far from the mean. Specifically, predictions for large-valued outcomes tend to be negatively biased (underestimating the actual values), while those for small-valued outcomes tend to be positively biased (overestimating the actual values). We refer to this linear warping of predictions toward the central tendency as the "systematic bias of machine learning regression". It can give rise to the well-known bias in biological age estimation (e.g., brain age and epigenetic clocks). In this paper, we first demonstrate that this systematic prediction bias persists across various machine learning regression models, and then examine its theoretical underpinnings. To address the issue, we propose a general constrained optimization approach designed to correct the bias, and we develop computationally efficient algorithms to implement it. Simulation results indicate that our correction method effectively eliminates the bias from the predicted outcomes. We apply the proposed approach to the prediction of brain age from neuroimaging data. Compared with competing machine learning regression models, our method effectively addresses the longstanding "systematic bias of machine learning regression" in neuroimaging-based brain age calculation, yielding unbiased predictions of brain age.
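For readers who want to see the phenomenon before the webinar, the following is a minimal sketch of the bias described above. It is not the speaker's method or data: the simulated design, the noise level, the use of ordinary least squares as the regression model, and all variable names are assumptions chosen only for illustration. It fits a model on simulated training data, predicts on held-out data, and checks how the prediction error (predicted minus actual) relates to the actual outcome.

```python
import numpy as np

# Illustrative simulation only; every setting below is an assumption.
rng = np.random.default_rng(0)
n_train, n_test, p = 2000, 2000, 5
beta = np.ones(p)      # hypothetical true coefficients
noise_sd = 2.0         # hypothetical noise level


def simulate(n):
    """Draw features and a noisy continuous outcome from a linear model."""
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(scale=noise_sd, size=n)
    return X, y


X_train, y_train = simulate(n_train)
X_test, y_test = simulate(n_test)

# Ordinary least squares with an intercept, fitted on the training set.
A_train = np.column_stack([np.ones(n_train), X_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Out-of-sample predictions and their errors (predicted minus actual).
A_test = np.column_stack([np.ones(n_test), X_test])
y_pred = A_test @ coef
error = y_pred - y_test

# Slope of the error regressed on the actual outcome:
# a negative slope means large outcomes are underestimated
# and small outcomes are overestimated.
bias_slope = np.polyfit(y_test, error, 1)[0]

# Mean error in the lowest and highest quartiles of the actual outcome.
q25, q75 = np.quantile(y_test, [0.25, 0.75])
mean_error_low = error[y_test <= q25].mean()
mean_error_high = error[y_test >= q75].mean()

print(f"slope of error on actual outcome: {bias_slope:.3f}")
print(f"mean error, lowest quartile:      {mean_error_low:.3f}")
print(f"mean error, highest quartile:     {mean_error_high:.3f}")
```

Under these assumptions the slope should come out negative, with positive mean error in the lowest quartile and negative mean error in the highest quartile, even though the fitted model is correctly specified. The sketch only exhibits the bias; the constrained optimization correction proposed in the talk is not reproduced here.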

Short Bio:

Dr. Chen is an MPower Professor of Biostatistics and Bioinformatics in the School of Medicine at the University of Maryland. He received his Ph.D. in Biostatistics from Emory University, working with DuBois Bowman on his dissertation. His research focuses on developing statistical and ML/AI models for complex medical imaging, imaging-genetics, and multi-omics data, with the aim of improving the accuracy of inference and prediction and the replicability of findings across labs.

Hope to see you at the webinar!