ASA Connect

Sorry, wrong number: Statistical benchmark comes under fire

  • 1.  Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-17-2019 18:21
    A familiar type of story in the news, this time mentioning the ASA.

    They have a fun quote "inside the arcane world of statistics"!
    I've always thought we were a pretty cool and mysterious lot.


    Sorry, wrong number: Statistical benchmark comes under fire

    By MALCOLM RITTER AP Science Writer


    https://www.miamiherald.com/news/article237281119.html



    ------------------------------
    Glen Wright Colopy
    DPhil Oxon
    Data Scientist at Cenduit LLC, Durham, NC
    ------------------------------


  • 2.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-22-2019 16:26
    Glen,

    Thanks for posting this.

    This was a great article for many reasons, especially educating those of us not in the pharmaceutical industry. It puts their "bet the ranch" emphasis on significance testing in proper financial decision-making perspective. To anyone already familiar with highly regulated industries this article might seem like a "no-brainer." But the instant write down of a major financial investment by $1 Billion on the discovery of a p-value equal to 0.059, when 0.05 was required for FDA approval, is a "Wait ... what just happened?" moment for the rest of us. Typically, the financial industry's rules are considered the gold standard for major arcana, so to be included in that cabal could be considered the equivalent of being "way cool dude."

    Tom

    Thomas D. Sandry, PhD
    Industrial Statistical Consultant, Retired

    ------------------------------
    Thomas Sandry
    ------------------------------



  • 3.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-25-2019 09:02
    I have not been in the world of clinical trials for many years, but this article is disturbing.  If (and it's a big if) the study was designed to adequately measure a clinically (not just statistically) significant result, wouldn't a better path forward be to acknowledge that the results are promising but may require additional testing?  Maybe I'm being too simplistic, but therapeutic agents that benefit patients are not common and are of great value.

    ------------------------------
    Morris Olitsky
    Statistician
    USDA
    ------------------------------



  • 4.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-26-2019 06:47

    Sharing here what I wrote on LinkedIn, which is a good channel for sharing these concerns with a wider professional but non-statistical community. 

    A good article on the perils of p-values, written in a clear, accessible manner. This is a good article to share with non-statisticians who need to use statistical results - business managers, engineers, physicians and other medical technicians, attorneys and other legal professionals, and so on. The article is excellent, but I wish it had covered one more thing: p-hacking. In the article, a study is deemed a failure because it just missed an arbitrary significance level of 0.05. One risk in these cases - something people should watch out for - is repeating experiments or other tests over and over until one, by chance, barely achieves the 0.05 level, which is unwisely regarded as a gold standard.
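
To make the risk of repeated testing concrete, here is a minimal Python sketch (not taken from the article or this post) of how often at least one of several repeated tests reaches p < 0.05 when there is no real effect. It assumes the tests are independent and that p-values are uniformly distributed under the null hypothesis, which holds for continuous test statistics; the function names are illustrative only.

```python
import random

ALPHA = 0.05  # the conventional threshold discussed in the article


def chance_of_false_positive(attempts: int, alpha: float = ALPHA) -> float:
    """Probability that at least one of `attempts` independent tests of a
    true null hypothesis yields p < alpha."""
    return 1.0 - (1.0 - alpha) ** attempts


def simulate(attempts: int, trials: int = 20_000,
             alpha: float = ALPHA, seed: int = 0) -> float:
    """Monte Carlo check of the formula above.

    Assumption: under a true null, each p-value is uniform on [0, 1),
    so a single random draw stands in for one whole experiment.
    """
    rng = random.Random(seed)
    hits = sum(
        any(rng.random() < alpha for _ in range(attempts))
        for _ in range(trials)
    )
    return hits / trials


for k in (1, 5, 20):
    print(k, round(chance_of_false_positive(k), 3), round(simulate(k), 3))
```

Under these assumptions the chance of at least one "significant" result is about 23% after 5 attempts and about 64% after 20, even though nothing real is going on.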

    My colleague Eric Vance mentioned the xkcd cartoon on p-hacking (jelly beans and acne) as a good explanation of p-hacking for people who aren't statistical experts.



    ------------------------------
    David J Corliss, PhD
    Director, Peace-Work www.peace-work.org
    davidjcorliss@peace-work.org
    ------------------------------



  • 5.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-26-2019 18:13
    Hello All,
     
    I have emailed Malcolm Ritter, author of the subject AP article, requesting any additional information about the nature of the drug trial and its outcome which he could supply while maintaining the confidence and privacy of the sources of his story.  I'll post whatever he's willing and able to share publicly.

    Tom

    Thomas D. Sandry, PhD
    Industrial Statistical Consultant, Retired

    ------------------------------
    Thomas Sandry
    ------------------------------



  • 6.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 11:12
    Here's a link to the abstract:

    https://www.nejm.org/doi/full/10.1056/NEJMoa1908655


    ------------------------------
    Richard McNally
    Statistical Fellow
    Covance
    ------------------------------



  • 7.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 12:49
    Richard,

    Thank you very much. The link to the New England Journal of Medicine abstract is exactly what's needed.  I highly recommend it to anyone following this thread who wants to understand the clinical background that led to the publication of the subject AP article.  Clearly this is a very serious "near-miss" for statistical significance and justification for some equally serious modifications to the way statistical decision-making criteria are implemented within regulated industries and professions.  Assessment of the clinical evidence of a therapeutic benefit I must leave to subject-matter experts, but as a technical specialist I find it compelling.  That this near-miss actually happened speaks volumes and deserves a wide audience.

    My thanks also to author Malcolm Ritter, who responded to me privately and recommended contacting Dr. Scott Solomon, lead author of the NEJM publication, whose story Malcolm told in his AP article.  Thanks also to David Couper, who encouraged me to contact Dr. Solomon directly, in spite of my personal reluctance.  David, your advice was excellent and perhaps it will happen now that Richard's post has resolved my dilemma.

    Each of you has helped a lot.  All of this discussion convinces me that this story has legs, and that this is the tip of the iceberg.

    Tom

    Thomas D. Sandry, PhD
    Industrial Statistical Consultant, Retired

    ------------------------------
    Thomas Sandry
    ------------------------------



  • 8.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 09:16
    Replying to David Corliss: Downward P-hacking ("significance questing") such as you described is certainly a major, widespread problem. But we should not overlook its mirror, upward P-hacking: looking for and focusing on nonsignificance ("significance detesting") when the researchers want to report no association or effect, or reviewers and editors want to see null reports that address previous concerns, as arises in settings where there are important stakes on the null.
    See
    Antidepressant Use During Pregnancy and Autism Spectrum Disorder in Children
    For a general discussion of the problem of labeling "nonsignificant" results as "no association", see
    Invited Commentary: The Need for Cognitive Science in Methodology
    Our proposed educational reforms to mitigate P-hacking problems are at 
    https://arxiv.org/abs/1909.08579
    https://arxiv.org/abs/1909.08583



    ------------------------------
    Sander Greenland
    Department of Epidemiology and Department of Statistics
    University of California, Los Angeles
    ------------------------------



  • 9.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 10:19
    Hey Sander,
    Thanks for these links!
    That certainly is a side of the issue that gets less attention.

    ------------------------------
    Glen Wright Colopy
    DPhil Oxon
    Data Scientist at Cenduit LLC, Durham, NC
    ------------------------------



  • 10.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 10:39
    Thank you, Sander Greenland! I gave just one example as an illustration - as Sander says, p-hacking upwards is a serious problem also.

    ------------------------------
    David J Corliss, PhD
    Director, Peace-Work www.peace-work.org
    davidjcorliss@peace-work.org
    ------------------------------



  • 11.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 11:45
    Here's the link to the cartoon: https://xkcd.com/882/

    I think the cartoon does a wonderful job at helping folks understand the perils of searching for p<0.05 significance or in believing headlines about scientific/statistical findings uncritically.

    ------------------------------
    Eric Vance
    LISA-University of Colorado Boulder
    Associate Professor and Director
    Boulder CO, United States
    ------------------------------



  • 12.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 12:28
    When I'm teaching statistics, I use the P-value because of how common it is. I emphasize that a small P-value means we should do further investigation and it's not "proof" something is different or similar.

    For science in general, I'd love to replace the P-value with the statistical power of the test. Under simulation, the power is the probability that others who repeat your experiment will also find a "statistically significant difference". That is, if you found a "statistically significant difference" with a power of 0.90, about 90% of people who repeat your experiment will also find a statistically significant difference. If the power of your test is 0.10, you may not find a significant difference, but about 10% of people who repeat your experiment will. I know there are some issues with using statistical power, but it's easy enough to calculate with a simple Excel spreadsheet. We could tell students and scientists that they may believe the results of an experiment with any power level. However, an experiment with a power of 0.60 will likely come under fire for being unreproducible, and an experiment with a power of 0.90 will probably lead to something special.
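
The "power as the share of repeat experiments that reach significance" reading can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: a two-sided one-sample z-test with known standard deviation, and, critically, a true effect equal to the value plugged into the power calculation, which is not known in practice. The effect size, sigma, and sample size are made-up numbers.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def z_test_power(effect: float, sigma: float, n: int,
                 alpha: float = 0.05) -> float:
    """Analytic power of a two-sided one-sample z-test with known sigma."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # mean of the z statistic
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def replication_rate(effect: float, sigma: float, n: int,
                     alpha: float = 0.05, replications: int = 20_000,
                     seed: int = 1) -> float:
    """Fraction of simulated repeat experiments with |z| above the cutoff.

    Assumption: the true effect is exactly `effect`.
    """
    rng = random.Random(seed)
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    se = sigma / n ** 0.5  # standard error of the sample mean
    significant = 0
    for _ in range(replications):
        sample_mean = rng.gauss(effect, se)  # draw from the sampling distribution
        if abs(sample_mean / se) > z_crit:
            significant += 1
    return significant / replications


# Hypothetical design: effect of 0.5 sd with n = 42 gives power of about 0.90.
print(round(z_test_power(0.5, 1.0, 42), 3), round(replication_rate(0.5, 1.0, 42), 3))
```

The simulated rate tracks the analytic power closely, but only because the simulation is told the true effect. When the effect is estimated from the same data that produced the P-value, that agreement is no longer guaranteed.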

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 13.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-27-2019 13:07
    Replying to Andrew Ekstrom: You said "I emphasize that a small P-value means we should do further investigation and it's not "proof" something is different or similar." I hope you would amend that to
    "I emphasize that a small P-value means we should do further investigation and it's not 'proof' that groups are different, and that a large P-value does not necessarily mean we should not investigate further or that groups are similar."

    Then: The analysis of data via power has been proposed and criticized at length for over a quarter of a century. While at first appealing, it turns out to be very tricky and prone to fallacies, perhaps even more so than basic significance and hypothesis tests. Here are a few of the items from our very own TAS that outline key problems and point to earlier works doing the same:
    https://www.vims.edu/people/hoenig_jm/pubs/hoenig2.pdf
    also see p. 7-8 here:
    https://amstat.tandfonline.com/doi/suppl/10.1080/00031305.2016.1154108/suppl_file/utas_a_1154108_sm5368.pdf
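
One widely noted difficulty with power computed after the fact can be sketched in a few lines: if "observed power" is calculated by treating the estimated effect as the true effect, it becomes a fixed function of the P-value and adds no new information. The sketch below assumes a two-sided z-test; the function name is illustrative and the P-values in the loop are arbitrary examples, apart from 0.059, which is the value discussed earlier in this thread.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def observed_power(p_value: float, alpha: float = 0.05) -> float:
    """Post hoc power of a two-sided z-test, computed by plugging the
    observed z statistic in as if it were the true standardized effect."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_obs = STD_NORMAL.inv_cdf(1 - p_value / 2)  # |z| implied by the P-value
    return STD_NORMAL.cdf(z_obs - z_crit) + STD_NORMAL.cdf(-z_obs - z_crit)


for p in (0.001, 0.01, 0.05, 0.059, 0.20):
    print(p, round(observed_power(p), 2))
```

In this setup a P-value of exactly 0.05 always corresponds to an observed power of roughly 0.50, and larger P-values correspond to lower observed power, whatever the study was.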


    ------------------------------
    Sander Greenland
    Department of Epidemiology and Department of Statistics
    University of California, Los Angeles
    ------------------------------



  • 14.  RE: Sorry, wrong number: Statistical benchmark comes under fire

    Posted 11-28-2019 10:26
    When I worked in the credit and insurance industries, I never approved any risk models with variables that were not significant at the .001 level or better. And even then I would always check to see if they added any significant business value.

    I know we teach .05, but it should always be taken with a grain of salt. If a significant result (statistically) is not significant (practically) then it is worthless.
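
A quick sketch of why statistical and practical significance can diverge: with a large enough sample, a tiny effect clears even a .001 threshold. The example assumes a two-sided one-sample z-test with known standard deviation, and the effect size and sample sizes are made-up numbers chosen only for illustration.

```python
import math


def two_sided_p(effect: float, sigma: float, n: int) -> float:
    """Two-sided P-value of a one-sample z-test when the observed mean
    difference equals `effect` and the standard deviation `sigma` is known."""
    z = effect * math.sqrt(n) / sigma
    # 2 * (1 - Phi(|z|)) written with erfc to stay accurate in the far tail
    return math.erfc(abs(z) / math.sqrt(2))


tiny_effect = 0.02  # hypothetical: 2% of one standard deviation
for n in (100, 10_000, 100_000):
    print(n, two_sided_p(tiny_effect, 1.0, n))
```

The effect is identical in every row; only the sample size changes, yet the P-value moves from clearly nonsignificant at n = 100 to far below .001 at n = 100,000.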


    ------------------------------
    Michael Mout
    MIKS
    ------------------------------