ASA Connect

  • 1.  Propensity score matching on Complex Survey Data

    Posted 09-16-2020 09:54
    Hi ASA Colleagues:

    Does anyone have recommendations for articles on propensity score matching with complex survey data?

    These are on my current reading list and I'd appreciate suggestions for other articles.

    Austin, Jembere, and Chiu (2018). Propensity score matching and complex surveys.

    DuGoff, Schuler, Stuart (2014). Generalizing Observational Study Results: Applying Propensity Score Methods to Complex Surveys.

    Karabon (2019). Applying Propensity Score Methods to Complex Survey Data Using SAS PROC PSMATCH.


    ------------------------------
    Brandy Sinco, BS, MA, MS
    Statistician Senior
    Michigan Medicine
    ------------------------------


  • 2.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 08:25

    Dear Brandy,

    Here are a few others; not all on matching (some are on weighting) but the ideas are similar.

    Ridgeway et al. (2015, Journal of Causal Inference); https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5802372/

    Lenis, Ackerman, and Stuart (2018, Computational Statistics and Data Analysis):  https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6034692/

    Lenis et al. (2019, Biostatistics):  https://pubmed.ncbi.nlm.nih.gov/29293896/

    I'm sure there are others too!

    Thanks,
    Liz



    ------------------------------
    Elizabeth Stuart
    Professor, Associate Dean for Education
    Johns Hopkins, Bloomberg School of Public Health
    ------------------------------



  • 3.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 08:48
    Brandy,

    What is your application? There are scores of papers where PSM is used to combine probability samples with non-probability samples; the earliest was arguably Sunghee Lee's dissertation work at Michigan (JOS 2006; https://www.scb.se/contentassets/ca21efb41fee47d293bbee5bf7be7fb3/propensity-score-adjustment-as-a-weighting-scheme-for-volunteer-panel-web-surveys.pdf); see also https://doi.org/10.1177/0049124108329643 , https://doi.org/10.1214/16-STS598, https://doi.org/10.1080/01621459.2019.1677241 for more recent work, and there are of course others.

    If your goal is really treatment effect estimation with complex survey data, in my mind as a design-based survey statistician, it is VERY complicated. If you compute pairwise differences on matched pairs, in the framework of population sampling you need to weight them with pairwise selection probabilities, and (1) no survey data set comes with them, and (2) it is a bit unclear what the population quantity is, given that we are talking about hypothetical metapopulation potential outcomes. Once you lose the probability spaces, you are left with algorithms -- you can evaluate their performance with Monte Carlo, but whether any sort of central limit theorem applies is anybody's guess.
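
    To make the weighting issue concrete, here is one way the pair-weighted estimator could be written down. This is a sketch under assumptions that are not in the post: M denotes the set of matched pairs found in the sample, d_ij is the outcome difference within pair (i, j), and pi_ij is the joint probability that units i and j are both selected. Survey files normally ship only the single-unit weights 1/pi_i, which is point (1) above.

```latex
\hat{D} \;=\; \frac{\sum_{(i,j)\in M} d_{ij}/\pi_{ij}}{\sum_{(i,j)\in M} 1/\pi_{ij}},
\qquad d_{ij} = y_i - y_j,
\qquad \pi_{ij} = \Pr(i \in S \text{ and } j \in S).
```

    Because M is itself produced by running the matching algorithm on the realized sample S, it is not obvious which fixed population quantity this ratio would be estimating, which is point (2).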

    You might also want to ask this specifically in the SRMS section community list if you are a member.

    ------------------------------
    Stanislav Kolenikov
    Principal Scientist
    Abt Associates
    ------------------------------



  • 4.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 09:25
    Hi Elizabeth and Stanislav,

    My application uses the National Health Interview Survey (NHIS).  A physician has asked me to match adults hospitalized for traumatic injuries to the general adult population, matched on gender, race/ethnicity, age group, and region of the country.  I am exploring how to use propensity score matching to match 1-to-5, i.e., five people from the general adult population in the NHIS to each trauma surgery patient.  All data will come from the NHIS.  I am also interested in how to use coarsened exact matching with complex survey data.

    I am an ASA member but not currently a member of the SRMS section; I will consider joining.


    ------------------------------
    Brandy Sinco, BS, MA, MS
    Statistician Senior
    Michigan Medicine
    ------------------------------



  • 5.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 10:12
    Where do the adults hospitalized due to trauma come from? Is that your physician's sample / patients list?

    Since you have only a handful of categorical variables, you can just match on them as is -- append / row_bind the data, sort / group_by these variables, and voilà.
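
    The exact-matching idea in the paragraph above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names (`sex`, `race_eth`, `age_group`, `region`, `id`) and the toy records are hypothetical, not NHIS variable names, and the sketch ignores survey weights and design variables entirely. It groups the comparison pool by the matching variables and draws up to five pool records per case, without replacement.

```python
import random
from collections import defaultdict

# Hypothetical field names for the four matching variables.
MATCH_KEYS = ("sex", "race_eth", "age_group", "region")


def exact_match(cases, pool, keys=MATCH_KEYS, ratio=5, seed=0):
    """Match each case to up to `ratio` pool records with identical key values.

    Pool records are used at most once. Returns {case id: [pool ids]}.
    """
    rng = random.Random(seed)

    # Group pool ids by their combination of matching-variable values.
    strata = defaultdict(list)
    for rec in pool:
        strata[tuple(rec[k] for k in keys)].append(rec["id"])

    # Shuffle within each stratum so the draw is random but reproducible.
    for ids in strata.values():
        rng.shuffle(ids)

    matches = {}
    for case in cases:
        available = strata[tuple(case[k] for k in keys)]
        matches[case["id"]] = available[:ratio]
        del available[:ratio]  # remove matched records from the pool
    return matches


# Toy data: 7 candidates in the first stratum, only 3 in the second.
s1 = {"sex": "F", "race_eth": "A", "age_group": "18-44", "region": "South"}
s2 = {"sex": "M", "race_eth": "B", "age_group": "45-64", "region": "West"}
cases = [dict(id="c1", **s1), dict(id="c2", **s2)]
pool = [dict(id=f"p{i}", **s1) for i in range(7)]
pool += [dict(id=f"q{i}", **s2) for i in range(3)]

matches = exact_match(cases, pool)
for case_id, matched in matches.items():
    print(case_id, len(matched), matched)
```

    A case whose stratum has fewer than five candidates simply receives fewer matches, so it is worth tabulating the match counts before going further.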

    What is the end goal of the analysis?

    In our group, we did similar work a few years back: http://www.aapor.org/AAPOR_Main/media/AnnualMeetingProceedings/2015/H1-4-Burkey.pdf. (Well, we do a fair amount of this, but that's when we got to present it.)

    ------------------------------
    Stanislav Kolenikov
    Principal Scientist
    Abt Associates
    ------------------------------



  • 6.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 15:10
    Patients hospitalized due to traumatic injury are identified via a query to the "Injured or Poisoned" section of the National Health Interview Survey (NHIS).  Everyone in my dataset is from the NHIS.

    ------------------------------
    Brandy Sinco, BS, MA, MS
    Statistician Senior
    Michigan Medicine
    ------------------------------



  • 7.  RE: Propensity score matching on Complex Survey Data

    Posted 09-17-2020 10:27
    Brandy,

    DuGoff et al. (2014) have a great section in their paper describing 28 studies that have some kind of survey weights and use propensity score methods, yet approach the problem in a variety of ways.
    • 16 studies ignored the weights completely (while claiming representativeness)
    • 7 used the weights only in the outcome model
    • 5 used the weights in both the propensity score and outcome model
    Clearly there's confusion on how to handle survey weights with propensity score analysis. While it doesn't address complex survey design generally (e.g. cluster sampling), some colleagues and I showed mathematically how you should incorporate sampling weights into a propensity score analysis.

    G. Ridgeway, S. Kovalchik, B.A. Griffin, and M.U. Kabeto (2015). "Propensity score analysis with survey weighted data," Journal of Causal Inference 3(2):237-249.

    The primary conclusion is that you should use sampling weights in the propensity score estimation stage (as weights, not as a covariate), compute final weights as the product of the sampling weight and the propensity score weight, and use those final weights in an outcome model.
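
    The three steps in the paragraph above can be sketched as follows, using only the Python standard library. Several things here are assumptions for illustration rather than anything stated in the post: the toy data, the function names, the choice of a plain logistic model fitted by gradient ascent, and the choice of the effect on the treated as the estimand (so that comparison cases get the odds weight p/(1-p) and treated cases get 1).

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_weighted_logit(X, t, w, steps=3000, lr=0.5):
    """Logistic regression of t on X, maximizing the w-weighted log-likelihood.

    The weights enter as case weights, not as a covariate.
    """
    beta = [0.0] * (len(X[0]) + 1)  # intercept plus one slope per covariate
    total_w = sum(w)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for xi, ti, wi in zip(X, t, w):
            row = [1.0] + list(xi)
            p = _sigmoid(sum(b * v for b, v in zip(beta, row)))
            for j, v in enumerate(row):
                grad[j] += wi * (ti - p) * v
        beta = [b + lr * g / total_w for b, g in zip(beta, grad)]
    return beta


def final_weights(X, t, s):
    """Sampling weight times propensity score weight (ATT-style, an assumption)."""
    beta = fit_weighted_logit(X, t, s)  # step 1: sampling weights in the PS model
    out = []
    for xi, ti, si in zip(X, t, s):
        p = _sigmoid(sum(b * v for b, v in zip(beta, [1.0] + list(xi))))
        ps_weight = 1.0 if ti == 1 else p / (1.0 - p)
        out.append(si * ps_weight)  # step 2: product of the two weights
    return out


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data: one binary covariate x, treatment indicator t, sampling weight s.
X = [[0], [0], [0], [0], [1], [1], [1], [1]]
t = [1, 0, 0, 0, 1, 1, 1, 0]
s = [10, 20, 30, 40, 15, 25, 5, 50]

w_final = final_weights(X, t, s)

# Balance check: covariate mean among treated vs. reweighted comparison cases.
x = [row[0] for row in X]
trt_mean = weighted_mean([xi for xi, ti in zip(x, t) if ti == 1],
                         [wi for wi, ti in zip(w_final, t) if ti == 1])
ctl_mean = weighted_mean([xi for xi, ti in zip(x, t) if ti == 0],
                         [wi for wi, ti in zip(w_final, t) if ti == 0])
print(trt_mean, ctl_mean)
```

    Step 3 would then pass `w_final` as the weights in the outcome model, ideally with software that also accounts for the survey design (strata and clusters) when computing standard errors.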

    You will find papers that claim that you do not need to use the sampling weights in the propensity score estimation stage. You will also note that those same papers have no mathematics supporting their claims... just simulation or a particular applied example in which it turns out not to matter. We point out three specific cases that will cause problems if you do not use the sampling weights in the propensity score estimation stage:
    1. If there is a covariate z used in the sampling weights that is not used or available for the propensity score model, even if z is independent of the potential outcomes
    2. If the propensity score model has limited degrees of freedom and spends those degrees of freedom on the domain of pretreatment covariates x with small sampling weights
    3. If the sampling probability depends on treatment assignment, particularly for the case when treatment and control cases are drawn from different survey efforts or different survey waves
    David Lenis has some recent papers too that address other specific issues:
    https://doi.org/10.1016/j.csda.2018.05.003
    https://doi.org/10.1093/biostatistics/kxx063

    Greg

    ------------------------------
    Greg Ridgeway
    Professor and Chair, Department of Criminology
    Professor, Department of Statistics
    University of Pennsylvania
    ------------------------------