
  • 1.  JSM Toronto - Discussant needed for Topic Contributed Session -

    Posted 07-26-2023 15:11

    Dear colleagues, 

    We need a volunteer to serve as discussant at our Topic-Contributed session in Toronto. Our original discussant has an unexpected, urgent family matter and is unable to attend JSM in Toronto.

    The session description is available in the online program:

    https://ww2.aievolution.com/JSMAnnual/index.cfm?do=ev.viewEv&ev=2268

    It is also pasted below.

    Protected Health Information Privacy, and Statistical Disclosure Considerations for Projects using HIPAA


    Discussant: to be filled
    Organizer: Chris Barker
     
    Thursday, Aug 10: 10:30 AM - 12:20 PM
    1713 
    Topic-Contributed Paper Session 

    Presentations

    A Practical Introduction to Privacy Enhancing Technologies: From the Millionaire's Problem to Modern Data Disclosures

    Privacy Enhancing Technologies (PETs) are a broad set of technologies that allow data owners, data scientists and statisticians to join and disseminate information with theoretical and computational guarantees. They limit who can process personally or entity-identifiable information, and to what extent. Famously in the community, these technologies have been used to perform computations such as the private set intersection of datasets, and to provide provable privacy disclosure controls using differential privacy, as seen in the 2020 US Census. The range of PETs has never been as broad or as accessible as in recent years. This year these technologies have been recommended and advocated by the likes of the IEEE, the Royal Society and the United Nations. In this presentation, we will tour the PETs landscape, covering input and output privacy paradigms, technology maturity, and considerations for those planning to use PETs for statistics. Further, we will demonstrate accessible ways to start leveraging PETs in day-to-day statistical analysis.

    Speaker

    Jack Fitzsimons, Oblivious Software Ltd

    De-identification and privacy: The role of the statistician as a data guardian

    Statistical methods for probabilistic fuzzy matching enable the de-identification of private healthcare information with relative accuracy. This presentation discusses the various algorithms for de-identification and compares the minimum and data types needed to achieve de-identification. Tools for circumvention are examined with respect to efficiency and counter activity. Machine learning versus probabilistic algorithms are also debated.

    Speaker

    Jimmy Efird, Boston VA Cooperative Studies Program Coordinating Center

    Federated Learning via Random Forest: Personalized Treatment Effect Estimation from Heterogeneous Data Sources

    Accurately estimating personalized treatment effects within a study site (e.g., a hospital) has been challenging due to limited sample size. Furthermore, privacy considerations and lack of resources prevent a site from leveraging subject-level data from other sites. We propose a tree-based model averaging approach to improve the estimation accuracy of conditional average treatment effects (CATE) at a target site by leveraging models derived from other potentially heterogeneous sites, without them sharing subject-level data. To the best of our knowledge, there is no established model averaging approach for distributed data with a focus on improving the estimation of treatment effects. Specifically, under distributed data networks, our framework provides an interpretable tree-based ensemble of CATE estimators that joins models across study sites, while actively modeling the heterogeneity in data sources through site partitioning. The performance of this approach is demonstrated by a real-world study of the causal effects of oxygen therapy on hospital survival rate and backed up by comprehensive simulation results.

    Speaker

    Lu Tang, University of Pittsburgh

    Ethics, Misuse of Terms of Use of Federal Databases for De-anonymization of Patients

    I report my key finding: the discovery of a misuse of the Terms of Use (TOU) of at least two Crown Jewel federal databases to produce public-domain, non-password-protected companion datasets. This resulted in an unprecedented, first-ever publication of a "proof of concept" (POC) and algorithm for de-anonymizing individual data, which appears in a prestigious peer-reviewed economics journal. The algorithm and POC use characteristics of unique oncology clinical trials recorded in clinicaltrials.gov that create an unprecedented risk of de-anonymization of forty (40) unique patients using "big data" SEER (Surveillance, Epidemiology, and End Results) patient-level tumor data and, vice versa, use SEER to de-anonymize unique patients in "big data" clinicaltrials.gov. SEER is anonymized data. My finding of instances of patient-level data in clinicaltrials.gov appears to be completely unexpected and may be the first ever reported to privacy experts at both SEER and clinicaltrials.gov. My original report appears in the September 2022 issue of the AMSTAT (American Statistical Association) newsletter as "Finding the De-Anonymization Needle in the SEER Haystack".

    Speaker

    Chris Barker


    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com


    ---
    "In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
    -Steve Lacy
    ------------------------------


  • 2.  RE: JSM Toronto - Discussant needed for Topic Contributed Session -

    Posted 07-26-2023 16:10

    Thank you. We appreciate Mary Lwasny volunteering as discussant. The ASA has updated the program to reflect the change.



    ------------------------------
    Chris Barker, Ph.D.
    2023 Chair Statistical Consulting Section
    Consultant and
    Adjunct Associate Professor of Biostatistics
    www.barkerstats.com


    ------------------------------