Title: A Unified Framework for Inference with General Missingness Patterns and Machine Learning Imputation
Speaker: Zhenke Wu, Ph.D.
Associate Professor of Biostatistics and Global Public Health
University of Michigan, Ann Arbor
Time: Thursday, June 11, 12:00 PM – 1:00 PM, E.T.
Registration (free): https://events.teams.microsoft.com/event/239bc4bc-88fc-4ec1-9934-f4b7426190d9@e51cdec9-811d-471d-bbe6-dd3d8d54c28b
Abstract: Pre-trained machine learning (ML) predictions are increasingly used to complement incomplete data and enable downstream scientific inquiries, but their naive integration risks biased inference. Recently, multiple methods have been developed to provide valid inference with ML imputations regardless of prediction quality and to enhance efficiency relative to complete-case analyses. However, existing approaches are often limited to missing outcomes under a missing-completely-at-random (MCAR) assumption and fail to handle general missingness patterns (missingness in both the outcome and the exposures) under the more realistic missing-at-random (MAR) assumption. This paper develops a novel framework that delivers valid statistical inference for general Z-estimation problems using ML imputations under the MAR assumption and general missingness patterns. The core technical idea is to stratify observations by distinct missingness patterns and construct an estimator by appropriately weighting and aggregating pattern-specific information through a masking-and-imputation procedure on the complete cases. We provide theoretical guarantees of asymptotic normality of the proposed estimator and of efficiency dominance over weighted complete-case analyses. Practically, the method affords simple implementation by leveraging existing weighted complete-case analysis software. Extensive simulations validate the theoretical results, and a real data example further illustrates the practical utility of the proposed method. The paper concludes with a brief discussion of practical implications, limitations, and potential future directions.
Speaker Bio: Zhenke Wu, PhD, is a tenured Associate Professor of Biostatistics and of Global Public Health at the University of Michigan. Dr. Wu's research focuses on AI and Statistics for affordable and individualized healthcare, specifically advancing computational and interventional digital health. His methodological expertise spans structured Bayesian latent variable modeling, causal inference, and reinforcement learning for sequential decision-making, frequently utilizing dynamic data from wearable devices and mobile health platforms. He develops widely accessible open-source statistical tools which have supported critical public health initiatives in low- and middle-income countries. Dr. Wu earned his PhD in Biostatistics from Johns Hopkins University and a BS in Mathematics from Fudan University.
Host and Contact: Yang Shi, Karmanos Cancer Institute (yangsh AT karmanos.org)
------------------------------
Yang Shi
Wayne State University
------------------------------