ASA Connect

  • 1.  Propensity Score Matching When Some Data is Missing

    Posted 12-30-2021 09:53
    Dear ASA Community:

    In early 2022, I will be working on a propensity score matching project.  The population is people 65+ on Medicare.
    I want to match trauma patients with the general Medicare population on
    age, race/ethnicity, sex, out-of-pocket medical spending in the initial year, and self-rated health in the initial year.

    The challenge is that about 10% of the out-of-pocket spending and self-rated health values are missing.
    I am thinking of creating quartiles for out-of-pocket spending, plus an additional category for unknown.
    For self-rated health, I would use categories for good/excellent, fair/poor, and unknown.
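
    A minimal sketch of the categorization described above, using only the Python standard library. The function names, the use of None to mark a missing value, and the set of ratings counted as good/excellent are illustrative assumptions, not part of the original plan. Quartile cut points are computed from the observed values only.

```python
import statistics

# Assumption: these ratings count as "good/excellent"; anything else is "fair/poor".
GOOD_RATINGS = {"good", "very good", "excellent"}


def spending_category(values):
    """Label each value Q1..Q4 by quartile of the observed values.

    Missing values (None) get their own "unknown" category.
    """
    observed = [v for v in values if v is not None]
    # Three cut points that split the observed values into four groups.
    q1, q2, q3 = statistics.quantiles(observed, n=4)
    labels = []
    for v in values:
        if v is None:
            labels.append("unknown")
        elif v <= q1:
            labels.append("Q1")
        elif v <= q2:
            labels.append("Q2")
        elif v <= q3:
            labels.append("Q3")
        else:
            labels.append("Q4")
    return labels


def health_category(rating):
    """Collapse a self-rated health response into three categories."""
    if rating is None:
        return "unknown"
    return "good/excellent" if rating.lower() in GOOD_RATINGS else "fair/poor"


# Hypothetical data, for illustration only.
spending = [100, 200, 300, 400, 500, 600, 700, 800, None]
print(spending_category(spending))
print([health_category(r) for r in ["excellent", "poor", None]])
```

    The resulting labels can then enter the propensity model as ordinary categorical covariates, with "unknown" treated as one more level.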

    After the matched sample has been created, I would compare out-of-pocket medical spending and self-rated health over a 4-year period.
    Then, I would repeat the analysis after imputing the missing data with multiple imputation.

    Does anyone know of articles that provide insight on whether this is a sound statistical analysis plan?

    If the missing data were imputed with multiple imputation before the propensity score matching, it seems to me that the matching could get messy.

    ------------------------------
    Brandy Sinco, BS, MA, MS
    Statistician Senior
    Michigan Medicine
    ------------------------------


  • 2.  RE: Propensity Score Matching When Some Data is Missing

    Posted 12-31-2021 07:17
    Brandy:

    Conveniently, I've just done a (very modest) lit search on this topic, which I share below:

    D'Agostino and Rubin (2000) suggested an EM algorithm with a general location (GL) model to estimate the missing propensity scores conditional on all of the data; these estimates were then used in a matching approach to obtain the treatment effect. No attempt was made to account for uncertainty, and no simulations were conducted. Mitra and Reiter (2011) extended this approach using a latent mixture GL model and multiple imputation. Qu and Lipkovich (2009) multiply imputed based on covariates only, but then interacted the imputation models with missing data patterns to provide a partial correction in the NMAR (not missing at random) setting.

    Granger et al. (2019) and Leyrat et al. (2019) independently showed that one needs to multiply impute the propensity scores and then conduct a full MI analysis (what they term MI-within), rather than simply taking the mean of the multiply imputed propensity scores and using it in a single analysis (MI-across). Leite et al. (2021), however, argued that in practical settings with modest amounts of missing data, MI-across or even single imputation could be adequate.

    D'Agostino and Rubin, Mitra and Reiter, and Granger et al. all allowed missingness to be a function of the outcome and included the outcome in the imputation in their simulations and analyses; Qu and Lipkovich and Leite et al. assumed that missingness depended only on the propensity covariates and ignored the outcome in the imputation.
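
    To make the MI-within versus MI-across distinction concrete, here is a rough sketch using only the Python standard library. It assumes the imputation, propensity score estimation, and matching have already been done elsewhere. The function names and all numbers are hypothetical. MI-across averages each subject's propensity score over the imputed datasets before a single analysis; MI-within analyzes each imputed dataset separately and pools the resulting estimates with Rubin's rules.

```python
import statistics


def mi_across_scores(scores_by_imputation):
    """MI-across: average each subject's propensity score over the m imputations.

    scores_by_imputation is a list of m lists, one per imputed dataset,
    each holding one propensity score per subject in the same order.
    """
    return [statistics.fmean(subject) for subject in zip(*scores_by_imputation)]


def rubin_pool(estimates, variances):
    """MI-within: pool m per-imputation estimates with Rubin's rules.

    Returns the pooled estimate and its total variance, where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates)  # sample variance, divisor m - 1
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical propensity scores for two subjects across two imputations.
averaged = mi_across_scores([[0.2, 0.6], [0.4, 0.8]])
print(averaged)

# Hypothetical treatment effect estimates and variances from three imputations.
pooled, total_var = rubin_pool([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

    Under MI-across, the averaged scores feed one matching step and one outcome analysis. Under MI-within, matching and effect estimation are repeated in every imputed dataset, and only the final estimates are combined.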

    I do think this is a somewhat understudied area, although others might point to work I've missed.

    References

    D'Agostino Jr, R. B., and Rubin, D. B. (2000). Estimating and using propensity scores with partially missing data. Journal of the American Statistical Association, 95, 749-759.

    Granger, E., Sergeant, J. C., and Lunt, M. (2019). Avoiding pitfalls when combining multiple imputation and propensity scores. Statistics in Medicine, 38, 5120-5132.

    Leite, W. L., Aydin, B., and Cetin-Berber, D. D. (2021). Imputation of missing covariate data prior to propensity score analysis: A tutorial and evaluation of the robustness of practical approaches. Evaluation Review, in press.

    Leyrat, C., Seaman, S. R., White, I. R., Douglas, I., Smeeth, L., Kim, J., ... and Williamson, E. J. (2019). Propensity score analysis with partially observed covariates: How should multiple imputation be used? Statistical Methods in Medical Research, 28, 3-19.

    Mitra, R., and Reiter, J. P. (2011). Estimating propensity scores with missing covariate data using general location mixture models. Statistics in Medicine, 30, 627-641.

    Qu, Y., and Lipkovich, I. (2009). Propensity score estimation with missing values using a multiple imputation missingness pattern (MIMP) approach. Statistics in Medicine, 28, 1402-1414.



    ------------------------------
    Michael Elliott
    University of Michigan
    ------------------------------