Good Analytical Practice or: What the Medical Product Industry Gets Right

By Richard Zink posted 04-12-2016 07:52


Statisticians in the medical product industry (pharmaceuticals, biologics, devices and diagnostics) operate within a highly regulated work environment. Because of this, there are scores of standard operating procedures and guidances that specify how studies should be designed, conducted, analyzed and reported. For example, there are International Conference on Harmonisation (ICH) guidances for Good Clinical Practice (GCP) and Good Manufacturing Practice (GMP), which describe how we (should) monitor clinical trials and manufacture drug product, respectively [1,2]. But what about Good Analytical Practice?

Many would say that Good Analytical Practice is implied by ICH E9: Statistical Principles for Clinical Trials [3]. The details of our experiment (clinical trial) are finalized in a written protocol prior to beginning the study. Blinding of treatment assignment is recommended where possible. A written statistical analysis plan detailing methodologies and data summaries is completed prior to database lock and unblinding. A small number of primary endpoints is specified, with strict attempts to control the type I error rate. After unblinding, analyses are performed and study reports are written, documenting any deviations from the original analysis plan.

This is a great start. However, can ICH E9 be considered complete?

For example,

  1. The analysis is only as good as the data
    1. There is no discussion of the database lock process, such as what checks should be performed, who the signatories should be, and the implications of altering the database after it is considered final.
    2. There are no specific details on data quality, though the GCP document is referenced; details for ongoing data quality assessments should probably be highlighted [4].
    3. There is limited discussion of data validation and what this entails: “Once data validation is complete, the analysis can proceed…”


  2. The analysis is only as good as the analysis
    1. There are data, such as those collected from case report forms or diaries (what would comprise CDISC SDTM domains), and there are analysis data that may involve windowing, imputation, or other complex derivations (what would comprise CDISC ADaM). There is really no mention of validation beyond the statement above, and no distinction appears to be drawn between these two kinds of data. As general practice in the industry, however, two sets of independent programming are performed to generate a given data set from pre-defined specifications (competing-code validation). The resulting data sets are compared to identify differences, and the programming is refined until no differences remain (see the sketch after this list). Such rigor is important, since certain data or conditions may arise that cause one (or both) implementations to fail.
    2. Similar validation approaches are often applied to the analyses themselves; that is, two sets of independent programming are performed to generate each table and figure. This is important for several reasons: complex derivations can often be performed in different ways with differing outcomes, statistical software often has numerous options available, and sometimes people make mistakes. What many may fail to realize is that translating a written analysis plan into analysis code is often subject to some level of interpretation.
    3. Multiplicity is mentioned, but there are no best practices for how to report it. For example, should only raw p-values be presented, with the appropriate alpha comparisons left to the reader, or should multiplicity-adjusted p-values be presented as well? (A small example follows this list.)
    4. There is no discussion of good practices for data visualization.
    5. There is no discussion on the reporting of results. In my past experience in the industry, any document that went outside the company went through multi-disciplinary review with a validation of contents.
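
To make the competing-code validation described above concrete, here is a minimal sketch of the comparison step. The file names, key variables, and the use of Python/pandas are purely illustrative assumptions; in practice this comparison is often done in SAS (for example, with PROC COMPARE) against formal programming specifications.

```python
# Minimal sketch of the comparison step in competing-code validation: two
# programmers independently derive the same analysis data set from agreed
# specifications, and the results are compared until no differences remain.
# File names and key variables below are hypothetical.
import pandas as pd

KEYS = ["USUBJID", "PARAMCD", "AVISIT"]  # hypothetical sort/merge keys

def compare_datasets(path_primary: str, path_qc: str) -> pd.DataFrame:
    """Return a data frame of cell-level discrepancies between two derivations."""
    primary = pd.read_csv(path_primary).sort_values(KEYS).reset_index(drop=True)
    qc = pd.read_csv(path_qc).sort_values(KEYS).reset_index(drop=True)

    # Structural checks first: same columns and the same number of rows.
    if list(primary.columns) != list(qc.columns):
        raise ValueError("Column mismatch between primary and QC data sets")
    if len(primary) != len(qc):
        raise ValueError(f"Row counts differ: {len(primary)} vs {len(qc)}")

    # Cell-by-cell comparison; only disagreeing rows/columns are returned.
    return primary.compare(qc)

if __name__ == "__main__":
    diffs = compare_datasets("adlb_primary.csv", "adlb_qc.csv")
    if diffs.empty:
        print("No differences found: the two derivations match.")
    else:
        print(diffs)  # investigate and refine the programming until this is empty
```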

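On the multiplicity point, one reader-friendly convention is to report raw and multiplicity-adjusted p-values side by side. The snippet below is a small sketch using statsmodels; the endpoint names and p-values are hypothetical placeholders, and the adjustment method (Holm here) would of course be pre-specified in the analysis plan.

```python
# Minimal sketch: report raw p-values alongside multiplicity-adjusted values
# so the reader does not have to do the alpha bookkeeping. The endpoints and
# p-values below are hypothetical placeholders.
from statsmodels.stats.multitest import multipletests

endpoints = ["ENDPT1", "ENDPT2", "ENDPT3", "ENDPT4"]
raw_p = [0.012, 0.049, 0.003, 0.210]  # hypothetical raw p-values

# Holm adjustment; the method would be pre-specified in the analysis plan.
reject, adj_p, _, _ = multipletests(raw_p, alpha=0.05, method="holm")

print(f"{'Endpoint':<10}{'Raw p':>10}{'Holm-adj p':>14}{'Reject H0':>12}")
for name, p, ap, rej in zip(endpoints, raw_p, adj_p, reject):
    print(f"{name:<10}{p:>10.4f}{ap:>14.4f}{str(rej):>12}")
```
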
It may be worthwhile to consider what other important points should be summarized and included to define a single document on Good Analytical Practice. Going further, perhaps we should also review where the industry may fall short. For example, are treatment codes withheld from the study team for open-label trials? Is the primary endpoint withheld from the study team for a single arm trial? These are good practices to limit potential bias. In general, however, the medical product industry gets a lot of things right for Good Analytical Practice, but there is always room for improvement. 

But what of other research? I’d like to propose that there is value in defining basic Good Analytical Practice for other industries and areas of research. For example, the American Statistical Association (ASA) recently released a statement and numerous commentaries (see the supplemental information) on the use and interpretation of p-values [5]. While this document, or something like it, has been needed for quite some time, the outright ban of p-values by one journal is likely responsible for bringing the topic to the forefront [6]. One important recommendation is to encourage greater use of, and place greater emphasis on, confidence intervals.
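
As a small illustration of that recommendation, the sketch below reports an estimated treatment difference with its 95% confidence interval alongside the p-value, rather than the p-value alone. The data are simulated, and a simple Welch two-sample comparison is assumed.

```python
# Minimal sketch: report an effect estimate and confidence interval alongside
# the p-value, rather than the p-value in isolation. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2016)
treatment = rng.normal(loc=1.2, scale=2.0, size=60)  # hypothetical responses
control = rng.normal(loc=0.5, scale=2.0, size=60)

diff = treatment.mean() - control.mean()

# Welch's t-test and the matching 95% confidence interval for the difference.
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)
se = np.sqrt(treatment.var(ddof=1) / len(treatment) + control.var(ddof=1) / len(control))
df = se**4 / ((treatment.var(ddof=1) / len(treatment))**2 / (len(treatment) - 1)
              + (control.var(ddof=1) / len(control))**2 / (len(control) - 1))
ci = diff + np.array([-1, 1]) * stats.t.ppf(0.975, df) * se

print(f"Difference in means: {diff:.2f}, 95% CI ({ci[0]:.2f}, {ci[1]:.2f}), p = {p_value:.4f}")
```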

Other analytical practices that should see wider use in other areas of research include independent validation of data sets and analyses. Two recent examples are highlighted in [7]. The Nevins-Potti example refers to a microarray study of chemosensitivity conducted at Duke University and highlights the benefit of independent analysis in uncovering misconduct [8]. However, this independent analysis was only possible because the data were in the public domain (and, initially, the findings went ignored). The Reinhart-Rogoff controversy involved economic austerity policies. Not only were there errors in the analysis (which validation could have identified), but “unconventional weighting” contributed heavily to the final results [7]. (Another important lesson: report all analysis assumptions.)
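
The weighting lesson is easy to illustrate with a toy example. The sketch below uses entirely hypothetical country-level growth figures (it is not a reconstruction of the Reinhart-Rogoff data) to show how averaging country means (equal country weights) versus pooling all country-years can produce different headline numbers, which is exactly why such assumptions need to be reported.

```python
# Toy illustration with hypothetical numbers: the choice of weighting scheme
# when aggregating across countries can change the headline summary, which is
# why such assumptions should be reported explicitly.
import numpy as np

# Hypothetical annual GDP growth (%) for three countries in one debt category.
growth = {
    "Country A": np.array([2.1, 1.8, 2.4]),           # 3 country-years
    "Country B": np.array([-0.5]),                     # 1 country-year
    "Country C": np.array([1.0, 1.2, 0.9, 1.1, 1.3]),  # 5 country-years
}

# Scheme 1: average the country means (each country weighted equally).
equal_country_weights = np.mean([v.mean() for v in growth.values()])

# Scheme 2: pool all country-years (countries weighted by observations).
pooled = np.concatenate(list(growth.values())).mean()

print(f"Equal country weights: {equal_country_weights:.2f}%")
print(f"Pooled country-years:  {pooled:.2f}%")
```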

Many may argue that there are insufficient resources for this additional work, but some progress can be made. For example, independent validation of data and analyses for primary endpoints and hypotheses is a good first step. A statement that such validation occurred would provide additional confidence in the accuracy of reported results and conclusions. (But how to verify? That’s a good question.) Similar approaches can be applied to any findings that are contrary to current understanding.

I am grateful to be part of an industry with such a strong emphasis on Good Analytical Practice. But it is important to extend these practices to other industries and areas of research, as described above. Ideally, this will lead to higher-quality research in the literature and fewer retracted articles [9,10]. Research findings can have huge implications for economic (Reinhart-Rogoff) and medical practice. Take, for example, the fraudulent 1998 paper published in The Lancet by Andrew Wakefield and associates linking autism to the MMR vaccine [11]. (Another important lesson: always have controls.) This manuscript has contributed to reduced vaccination rates and the re-emergence of diseases long thought to be under control. While this particular paper described a small case series, it does emphasize the important role journal editors and reviewers play in Good Analytical Practice. Caroline White of the British Medical Journal provides details on a case of suspected research fraud and recommendations to journals for addressing it [12]. (Another important lesson: when in doubt, ask to see the data.)

Open questions for comments:

  1. Where does the medical product industry fall short for Good Analytical Practice?
  2. What topics are must-haves in the (currently fictitious) Good Analytical Practice guideline?

 

References

  1. International Conference on Harmonisation. (1996). E6: Good Clinical Practice.
  2. International Conference on Harmonisation. (2000). Q7: Good Manufacturing Practice for Active Pharmaceutical Ingredients.
  3. International Conference on Harmonisation. (1998). E9: Statistical Principles for Clinical Trials.
  4. US Food and Drug Administration. (2013). Guidance for Industry: Oversight of Clinical Investigations - A Risk-Based Approach to Monitoring.
  5. Wasserstein RL & Lazar NA. (2016). The ASA's statement on p-values: context, process, and purpose. The American Statistician. DOI: 10.1080/00031305.2016.1154108.
  6. Trafimow D & Marks M. (2015). Editorial. Basic and Applied Social Psychology 37: 1-2.
  7. Irizarry R, Peng R & Leek J. (2013, Apr 21). Nevins-Potti, Reinhart-Rogoff. Simply Statistics.
  8. Baggerly KA & Coombes KR. (2009). Deriving chemosensitivity from cell lines: Forensic bioinformatics and reproducible research in high-throughput biology. The Annals of Applied Statistics 3: 1309-1334.
  9. Grieneisen ML & Zhang M. (2012). A comprehensive survey of retracted articles from the scholarly literature. PLoS ONE 7(10): 1-15.
  10. Ioannidis JPA. (2005). Why most published research findings are false. PLoS Medicine 2(8): e124. DOI: 10.1371/journal.pmed.0020124.
  11. Godlee F, Smith J & Marcovitch H. (2011). Wakefield's article linking MMR vaccine and autism was fraudulent. British Medical Journal 342: 64-66.
  12. White C. (2005). Suspected research fraud: Difficulties of getting at the truth. British Medical Journal 331: 281-288.

Comments

06-01-2016 09:31

Posting these comments on behalf of Winfried Koch, Biostatistician at BDS Koch.
Two sets of independent programming is a reasonable standard measure to ensure the quality and correctness of analysis programs. A limitation, from my perspective, is that the two programmers are often not truly independent but more or less “correlated” through similarities in programming education, statistical understanding of the task, working style and tools, time pressure, and other factors. From my experience, there is a danger that the same misunderstanding of a statistical concept may be committed by both. A higher degree of independence might be achieved if, for example, one analysis were programmed in SAS by a statistical programmer and the second were done interactively in JMP by a statistician or data analyst, with the results then compared.
In clinical trials there is a relatively high standard for confirmatory analyses, based on guidelines and internal SOPs, but in my perception there is a lack of additional exploration and statistical modelling. For such exploratory analyses there is a strong need for validation if the results are to be used; perhaps good working practices for JMP and other interactive programs would be helpful.
I look forward to the continuation of your initiative.

04-22-2016 17:32

Thank you very much, Richard, for bringing up the discussion of a Good Analytical Practice guideline; it is well written and helpful.
I think it's a good idea to have a Good Analytical Practice guideline, which would provide good and consistent analytical practice across industries. However, a lot of thinking and diligence needs to go into planning and drafting such a guideline. Many companies have their own SOPs for most of the issues you mentioned in your blog, so a cross-industry collaboration may be helpful in order to capture the best practices. One has to be careful that the guideline does not conflict with existing guidelines (e.g., the GCDMP mentioned by Eric Pulkstenis).
Some of the topics that I think will be helpful to start with:
• Database lock process (e.g., defining the show-stoppers for a lock, the signatories, and when to consider a database snapshot versus a formal lock)
• Open-label studies – how and when can the sponsor maintain the blind internally? Sometimes it is not possible to maintain the blind if visit schedules differ between treatment groups.
• Interpretation of p-values and presentation of multiplicity-adjusted p-values
• Validation – I think most companies are fairly consistent in their data set and table/listing validation processes, i.e., the use of two sets of independent programming. However, it is also important to have a validation process for the programming specification document, which will be used by both independent programmers.

04-12-2016 10:41

Interesting post, Richard; just two comments. To the point about data quality, the Society for Clinical Data Management has published Good Clinical Data Management Practices (GCDMP) (http://www.scdm.org/sitecore/content/be-bruga/scdm/Publications/gcdmp.aspx), which is a great resource for determining what ‘good’ may look like in the CDM space. I suspect the development of good analytical practices will relate to risk tolerance: what one person finds ‘good enough’ others may not be comfortable with, and in some cases the answer may be gray rather than black and white. A most interesting discussion you have started!