Thank you for the clarification and update, and have fun!
Medication data is "near and dear to my heart": it is typically very messy data in pharma. When I worked in health economics in big pharma, the economists often asked for my help with concomitant medication data (as it is called in pharma) for their health economics models.
Messy as concomitant medication data may be in clinical trials, it is a challenge to summarize or use in any analysis. In an effort to provide methods to both visualize and use messy medication data, I developed a method for summarizing longitudinal (and messy) concomitant medication data using the "mean cumulative function" (MCF). Caveat emptor: credit for formulating the original MCF methodology goes to Bill Nelson, Doganaksoy et al. (cited in my paper).
The more recent, elegant extensions of the Nelson et al. methods (not in my paper) are formulated using counting processes.
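For readers unfamiliar with the MCF, here is a minimal sketch of the basic nonparametric estimate in the style of Nelson. It is an illustration only, not the method from the paper. It assumes each patient contributes a list of event times (for example, the study days on which a new concomitant medication was started) and a single end-of-follow-up time. The function name and the input layout are hypothetical.

```python
from collections import Counter


def mean_cumulative_function(event_times, followup_end):
    """Basic nonparametric MCF estimate.

    event_times:  dict patient_id -> list of recurrent-event times
                  (patients with no events may be omitted)
    followup_end: dict patient_id -> last observed time; this defines
                  the full set of patients under study
    Returns a list of (time, mcf) pairs at each distinct event time.
    """
    events_at = Counter()
    for pid, times in event_times.items():
        for t in times:
            if t > followup_end[pid]:
                raise ValueError(f"event after end of follow-up for {pid}")
            events_at[t] += 1

    mcf = 0.0
    curve = []
    for t in sorted(events_at):
        # Patients still under observation at time t.
        at_risk = sum(1 for end in followup_end.values() if end >= t)
        # Increment: events at t divided by the number at risk at t.
        mcf += events_at[t] / at_risk
        curve.append((t, mcf))
    return curve


# Illustrative data only: three patients, one with no events.
events = {"A": [10, 40], "B": [20]}
ends = {"A": 100, "B": 30, "C": 100}
curve = mean_cumulative_function(events, ends)
```

The estimate at time t reads as the average number of events per patient accumulated by t, with patients dropping out of the denominator once their follow-up ends.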
Chris Barker, Ph.D.
"In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
Original Message:
Sent: 05-14-2024 13:33
From: Christopher Ryan
Subject: what are the limits of the statistical consultant's responsibilities?
Thanks for all your insights.
A few more operational details may be helpful for context. The medical practice writes prescriptions, and patients fill them at a pharmacy of their choosing. The researchers want to try, with the patient's consent, a different way of prescribing those same meds that would be prescribed anyway, but with a different quantity and number of refills. This pertains to a statewide effort to make the process easier for patients by prescribing more pills/more refills for these particular meds, to reduce the need for such frequent visits to the medical practice to get new prescriptions. However, there are questions about whether finances, insurance, and other logistical issues would interfere at the point the patient presents to their pharmacy for dispensing, and that's the focus of the research.
The medical practice would provide names and birthdates of the patients, how many pills and refills were prescribed, and to which pharmacy. Research team members would then call the pharmacies and ask, e.g. "Patient Samwise Gamgee, birthdate ______ was prescribed ___ amount of this medicine with ___ refills, on ________ date, intending to have it filled at your pharmacy. Did they pick them up? How many pills and how many refills did they actually receive?" The difference between the amount prescribed and the amount received is the target of inference/analysis (really just descriptive at this stage).
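As a rough sketch of that descriptive target, the prescribed-minus-received difference could be computed per prescription and then summarized. The record layout and field names below are hypothetical and do not reflect the actual REDCap design.

```python
from dataclasses import dataclass


@dataclass
class FillRecord:
    # Hypothetical fields: one record per prescription.
    pills_prescribed: int
    refills_prescribed: int
    pills_received: int
    refills_received: int


def shortfall(rec):
    """Return (pill shortfall, refill shortfall) for one prescription."""
    return (
        rec.pills_prescribed - rec.pills_received,
        rec.refills_prescribed - rec.refills_received,
    )


def summarize(records):
    """Simple descriptive summary of shortfalls across prescriptions."""
    diffs = [shortfall(r) for r in records]
    n = len(diffs)
    return {
        "n": n,
        "n_filled_as_prescribed": sum(1 for d in diffs if d == (0, 0)),
        "mean_pill_shortfall": sum(d[0] for d in diffs) / n if n else 0.0,
        "mean_refill_shortfall": sum(d[1] for d in diffs) / n if n else 0.0,
    }


# Illustrative values only.
example = [FillRecord(90, 3, 90, 3), FillRecord(90, 3, 30, 0)]
summary = summarize(example)
```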
So the researchers will need ongoing, repeated access to PHI in a continuously growing database (although not growing very fast; we don't expect a huge number). Phone calls would be made in real time, so to speak. Hence my preference for all the data to be in REDCap (a highly controlled and audited environment), and for the research students to be logged into it while making the phone calls. No data need be downloaded to anyone's computer.
Research students would be formally and temporarily made volunteers/workers with the medical practice (with all the vetting that entails), so they could legitimately tell the pharmacist, "I'm Albus Dumbledore calling on behalf of The Medical Practice . . . ." It's not uncommon for a medical practice to seek this sort of follow-up from pharmacies.
------------------------------
Christopher Ryan
Clinical Associate Professor of Family Medicine
SUNY Upstate Clinical Campus
Original Message:
Sent: 05-14-2024 06:35
From: Chris Barker
Subject: what are the limits of the statistical consultant's responsibilities?
Thank you - great question. One key point is that you have Protected Health Information (PHI). Coincidentally, a few months ago the section held a webinar on data-privacy matters. The slide sets should be available.
I believe you have a strong case to request to collaborate with the investigator, the IRB, and the researchers. As to your responsibility, it may be to alert the IRB and the investigators that there may be a privacy problem. From what you've written, I interpret that the IRB recognized an important privacy risk by asking for a scrambled identifier. You may first want to ask the IRB whether privacy is in fact their concern.
If that is the IRB's concern, and because there is PHI, a serious privacy risk is the "de-identification" or "re-identification" of patients, possibly by accidental or deliberate linking of the data in the database to other information about those patients. (I will skip the distinction between de- and re-identification.) It's not entirely clear why the users need the PHI; I'm guessing it is because several different datasets need to be combined by a unique patient-level identifier.
The privacy literature demonstrates that approaches like the IRB's suggestion of a seemingly simple key "along with arbitrary, non-meaningful coded id number" are far less than perfect at preventing (de-) or (re-)identification of actual data. There are at least dozens of articles about how those methods do not protect the privacy of patients. A well-known example is the Netflix Prize ($$$$$), where anonymized data provided by Netflix was "de-identified" by matching to the IMDb (www.imdb.com) database. The authors of the paper identified individuals in sufficient detail, and my understanding is that they confirmed that with Netflix. This is the paper that "de-identified" the Netflix data, with theorems and proofs: https://www.cs.utexas.edu/~shmat/shmat_oak08netflix.pdf. A less technical description appeared in WIRED: https://www.wired.com/2007/12/why-anonymous-data-sometimes-isnt/
My simple-as-possible suggestion, again assuming that the PHI is needed "one time" only to combine data from separate files by a unique patient identifier: you or a designated person prepare the combined files, and then remove the PHI before providing the data to the users. Simply number each unique person using integers 1, 2, ..., N. That does not provide an absolute guarantee that individual patients can't be uniquely identified, but removing the PHI should help.
The users can then prepare the analyses they need without being concerned about accidental de-/re-identification. Also, the Census Bureau has developed and published methodologies to protect citizen privacy in census releases.
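The combine-then-strip step could be sketched as follows. This is only an illustration under stated assumptions: the column names are hypothetical, each file is assumed to hold at most one row per patient, and name plus birthdate is assumed to be the linking key. Shuffling before numbering is an extra precaution I have added so the integers do not encode alphabetical order; it is not a privacy guarantee.

```python
import random


def combine_and_strip_phi(practice_rows, pharmacy_rows, seed=None):
    """Link two files on (name, birthdate), then drop the PHI.

    Returns rows that carry only an arbitrary integer study_id plus
    the analysis variables. All field names are hypothetical.
    """
    keys = sorted({(r["name"], r["birthdate"]) for r in practice_rows})
    rng = random.Random(seed)
    rng.shuffle(keys)  # assumption: avoid IDs that follow name order
    study_id = {key: i for i, key in enumerate(keys, start=1)}

    pharmacy_by_key = {(r["name"], r["birthdate"]): r for r in pharmacy_rows}

    combined = []
    for r in practice_rows:
        key = (r["name"], r["birthdate"])
        fill = pharmacy_by_key.get(key, {})
        combined.append({
            "study_id": study_id[key],
            "pills_prescribed": r["pills_prescribed"],
            # None when the pharmacy file has no matching record.
            "pills_received": fill.get("pills_received"),
        })
    # Sort by study_id so row order does not mirror the input order.
    combined.sort(key=lambda row: row["study_id"])
    return combined
```

Only the person preparing the combined file ever handles both the identifiers and the integers; the users receive the output alone.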
------------------------------
Chris Barker, Ph.D.
Past Chair
Statistical Consulting Section
Consultant and
Adjunct Associate Professor of Biostatistics
www.barkerstats.com
---
"In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
-Steve Lacy
Original Message:
Sent: 05-13-2024 13:33
From: Christopher Ryan
Subject: what are the limits of the statistical consultant's responsibilities?
I am working with a researcher and their team of about 6 people (students). They intend to abstract some information about prescriptions from medical records from a practice in the community, along with information about the filling of those prescriptions from the pharmacies where they are taken. The practice is a willing partner. My intended involvement is to build a REDCap database into which the abstracted information can be entered, and then to conduct the analysis with those data. Protected Health Information is necessary to conduct the research--at least to get the needed data; thereafter PHI is irrelevant.
I've had productive and rewarding collaborations with this researcher in the past and hope to continue in the future. I usually provide advice about several aspects of data management, including privacy and security issues.
The IRB to which the researcher is applying has suggested maintenance of a key, containing the patient identifiers, along with arbitrary, non-meaningful coded id numbers, with only the meaningless id numbers being recorded in REDCap. A common approach, of course. But I worry about the key file. Where will it be stored? If the student team members will be doing all their work *at* the medical practice, on a computer owned by the medical practice, I guess I could live with that. But the prospect of 6 copies of said spreadsheet holding the key, floating around on 6 uncontrolled, unsecured laptops (and maybe USB keys--yikes!) troubles me.
For those unfamiliar with it, REDCap is a secure, HIPAA-compliant, web-based application that includes SSL connection, password sign-in, 2-factor authentication, very fine control over user permissions, and a robust audit trail/log of everything everyone does. Spreadsheets on laptops have none of this. I'm not entirely confident the IRB folks know much about REDCap, although I could be misreading that.
So my question: where do my responsibilities, as the statistical consultant, start and stop? If the IRB with jurisdiction says, "you can do it this way," and I disagree, then what?
Grateful for any perspectives.
------------------------------
Christopher Ryan
Clinical Associate Professor of Family Medicine
SUNY Upstate Clinical Campus
------------------------------