ASA Connect

  • 1.  Intersection of statistical significance and substantive significance

    Posted 05-18-2024 09:50
    Last summer, I participated in a somewhat lengthy discussion about statistical significance, mainly with Sander Greenland. I stated that proper education, not a ban on statistical significance, is required to ameliorate abuse and misuse. Toward that end, last week my open-access paper "Intersections of Statistical Significance and Substantive Significance: Pearson's Correlation Coefficients Under a Known True Null Hypothesis" was published by QEIOS at https://www.qeios.com/read/PS72PK
     
    This paper and a previous publication were written in response to the ban on statistical significance proclaimed by the editors of a special issue of The American Statistician and by the editors of Basic and Applied Social Psychology. My previous paper (Komaroff 2020) is rather dense with theoretical considerations and is behind a paywall. It has been cited by engineers working on lithium batteries and cryptocurrency but missed my target audience of students, applied researchers, and science writers. Therefore, for this new paper, I kept things rather simple and open access. I simulated empirical sampling distributions of p-values and correlation coefficients. With relatively simple graphs and 2 x 2 crosstab analyses, they demonstrate that statistical significance is still a vital and viable tool for screening out false effect sizes (effect size errors) when working with small sample sizes (n < 1,000). In addition, many undergraduate and graduate students have told me that increasing sample size increases the chance of finding statistical significance. Unfortunately, their textbooks and professors misinformed them. The five empirical sampling distributions demonstrate that, when the null hypothesis is true, increasing sample size does not increase the probability of finding a statistically significant p-value. However, with n = 1,000, statistical significance loses its screening purpose because non-effect sizes or very small, trivial correlations (according to Cohen's criteria) are now statistically significant.
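
The two claims above (a rejection rate that stays near alpha at every sample size under a true null, and a significance threshold for |r| that shrinks as n grows) can be checked with a short simulation. What follows is a minimal sketch and not the procedure used in the paper: it assumes independent standard normal x and y, a two-sided test at alpha = .05, and the Fisher z approximation in place of the exact t-test. The function names, seed, and replication count are illustrative choices.

```python
import math
import random
from statistics import NormalDist

ALPHA = 0.05
Z_CRIT = NormalDist().inv_cdf(1 - ALPHA / 2)  # two-sided critical value, about 1.96


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def is_significant(r, n):
    """Two-sided test of rho = 0 using the Fisher z approximation (an assumption)."""
    return abs(math.atanh(r)) * math.sqrt(n - 3) > Z_CRIT


def critical_r(n):
    """Approximate |r| above which the Fisher z test rejects at ALPHA."""
    return math.tanh(Z_CRIT / math.sqrt(n - 3))


def rejection_rate(n, reps, rng):
    """Share of significant results when x and y are independent (true null)."""
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        hits += is_significant(pearson_r(x, y), n)
    return hits / reps


if __name__ == "__main__":
    rng = random.Random(2024)  # fixed seed so the run is repeatable
    for n in (10, 100, 1000):
        rate = rejection_rate(n, reps=500, rng=rng)
        print(f"n={n:5d}  rejection rate={rate:.3f}  critical |r|={critical_r(n):.3f}")
```

Under these assumptions the rejection rate should hover around .05 for every n, while the critical |r| falls below Cohen's "small" benchmark of .10 by n = 1,000, which is the sense in which significance stops screening out trivial correlations at large n.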
     
    The QEIOS paper is open for peer review. I invite your review/comments either on the QEIOS website or this ASA Connect thread or by email at komaroffeugene@gmail.com
     
    References
     
    Komaroff, E. (2024). Intersections of Statistical Significance and Substantive Significance: Pearson's Correlation Coefficients Under a Known True Null Hypothesis. Qeios. doi:10.32388/PS72PK.
     
    Komaroff, E. (2020). Relationships between p-values and Pearson correlation coefficients, Type 1 errors and effect size errors, under a true null hypothesis. Journal of Statistical Theory and Practice, 14, 49. https://doi.org/10.1007/s42519-020-00115-6


    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------


  • 2.  RE: Intersection of statistical significance and substantive significance

    Posted 05-20-2024 08:05

    I think you overstate the "ban" on hypothesis testing.  That was not a unanimous conclusion in the 2019 AmStat special issue. Although we all agree that estimation (confidence or credibility statements) takes precedence over P-values in most situations, there are applications where there is no primary outcome parameter or parameters. For example, much of US law is rightly based on hypothesis testing, with varying burdens of proof required. A binary decision is required as to whether or not there is sufficient evidence to meet the preassigned burden of proof.  The probability of "guilt" given the evidence is nearly always intractable.

    Best,

    Jon



    ------------------------------
    Jonathan Shuster
    ------------------------------



  • 3.  RE: Intersection of statistical significance and substantive significance

    Posted 05-20-2024 15:08

    Hi Jon.  Re: "I think you overstate the 'ban' on hypothesis testing.  That was not a unanimous conclusion in the 2019 AmStat special issue."  I am not saying there was consensus or that this was an ASA policy.  I said: "... the editors in a subsequent editorial abandoned teaching statistical significance and called for a ban with the slogan 'statistically significant-don't say it and don't use it' (Wasserstein et al., 2019, p. 2)."  That it was not ASA policy is evident at the beginning of the editorial: "The editorial was written by the three editors acting as individuals and reflects their scientific views, not an endorsed position of the American Statistical Association."  I am also aware of this publication by members of the ASA supporting statistical significance that is adequately applied and interpreted:

    Support of p-values and statistical significance

    My primary aim is to contribute to the ongoing discourse and understanding of statistical significance, particularly in light of the debates and varying viewpoints.   

    Your reference to the legal system is interesting: "A binary decision is required as to whether or not there is sufficient evidence to meet the preassigned burden of proof."  I like the analogy to the court system.  In a criminal case, the prosecution must convince the jury that an alleged criminal is guilty beyond a reasonable doubt.  This is similar to scientific research where a small p-value (e.g., p < .05) permits the conclusion that the null is not true beyond a reasonable doubt.  An alpha level of statistical significance (e.g., α = .05) defines reasonable, and when p < α, that is beyond a reasonable doubt. Nevertheless, some doubt remains, which is called Type 1 error.  Interestingly, in civil cases, a guilty verdict requires a preponderance of evidence, as if α = .50.  Please note that I am not proposing any specific α level, cutpoint, or bright line for statistical significance. That decision depends on the research methodology. However, statistical significance is required by scientists who believe that natural phenomena materialize randomly (probabilistically).



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------



  • 4.  RE: Intersection of statistical significance and substantive significance

    Posted 05-20-2024 08:22

    Eugene:

    Your comment that students think that increasing sample size increases the probability of significance is interesting.

    I teach power and the value of increasing sample size in decreasing type II error (and thus of obtaining significance when the null is false).

    But, I don't know that I ever explicitly state, "Increasing sample size does not increase the probability of significance when the null hypothesis is TRUE". Maybe I should! It's obvious to a statistician and follows from the logic of hypothesis testing, but may not be obvious to students.
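
The contrast between a true and a false null can be made concrete with a closed-form calculation. This is a minimal sketch under simplifying assumptions that are not from the thread: a two-sided one-sample z-test with known unit variance, and a hypothetical standardized effect of 0.2 when the null is false. The function name `rejection_probability` is illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def rejection_probability(effect, n, alpha=0.05):
    """P(reject H0) for a two-sided one-sample z-test with known unit variance.

    `effect` is the true mean in standard-deviation units; effect = 0 means
    the null hypothesis is true.
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect * math.sqrt(n)  # mean of the z statistic under the truth
    return STD_NORMAL.cdf(-z_crit + shift) + STD_NORMAL.cdf(-z_crit - shift)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        under_null = rejection_probability(0.0, n)
        under_alt = rejection_probability(0.2, n)
        print(f"n={n:5d}  P(reject | null true)={under_null:.3f}  "
              f"P(reject | effect 0.2)={under_alt:.3f}")
```

With `effect = 0` the shift term vanishes, so the rejection probability equals alpha for every n; only when the effect is nonzero does a larger n raise the chance of significance.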

    Ed



    ------------------------------
    Edward Gracely
    Associate Professor
    Drexel University
    ------------------------------



  • 5.  RE: Intersection of statistical significance and substantive significance

    Posted 05-20-2024 16:12

    Hi Ed.  I will touch upon power in another paper, which I will present at JSM 2024 on Aug. 7th in the "Theory and Methods for Variable Section" session (2:00 - 3:50 PM).  It is the same story but with a new cast of characters. I consider the statistical significance of differences in means with independent-samples t-tests and evaluate substantive significance with Cohen's d under a true null hypothesis. In this case, the sampling distribution of p-values is uniform when the null hypothesis is true. However, when the null is false, the sampling distribution of p-values is right-skewed (not uniform). This reminds me of working on analysis plans for several grants where the cost of study participants put the projects over budget. A typical directive from the researchers: Figure out a way to reduce the sample size.  I responded by increasing the difference (distance) between the null and alternative parameters. I "bought" the same power with a smaller sample size. Incidentally, students and applied researchers may not appreciate that the new larger sample must have the same characteristics as the previous smaller sample, or they may still fail to reject the null hypothesis.
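
The trade between effect size and sample size described above can be illustrated with the textbook normal-approximation formula for a two-sided, two-sample comparison of means, n per group = 2((z_{1-α/2} + z_{power}) / d)^2. This is a rough sketch, not the calculation used in any of the grant proposals: the function name `n_per_group` and the example values of Cohen's d are hypothetical, and an exact t-based calculation would give slightly larger numbers.

```python
import math
from statistics import NormalDist


def n_per_group(d, power=0.80, alpha=0.05):
    """Approximate per-group n for a two-sided two-sample test of means.

    Uses the normal approximation; `d` is the assumed standardized
    difference (Cohen's d) between the null and alternative parameters.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    for d in (0.2, 0.5, 0.8):
        print(f"assumed d={d:.1f}  n per group for 80% power: {n_per_group(d)}")
```

Because the required n scales with 1/d^2, assuming a larger difference cuts the sample size sharply at the same nominal power. That power is only delivered if the true difference is in fact that large.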



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------