Hi Jon. Re: "I think you overstate the 'ban' on hypothesis testing. That was not a unanimous conclusion in the 2019 AmStat special issue." I am not saying there was a consensus or that this was ASA policy. I said: "... the editors in a subsequent editorial abandoned teaching statistical significance and called for a ban with the slogan 'statistically significant-don't say it and don't use it' (Wasserstein et al., 2019, p. 2)." That editorial makes this evident at the outset: "The editorial was written by the three editors acting as individuals and reflects their scientific views, not an endorsed position of the American Statistical Association." I am also aware of the publication by members of the ASA supporting statistical significance when it is adequately applied and interpreted:
My primary aim is to contribute to the ongoing discourse and understanding of statistical significance, particularly in light of the debates and varying viewpoints.
Your reference to the legal system is interesting: "A binary decision is required as to whether or not there is sufficient evidence to meet the preassigned burden of proof." I like the analogy to the court system. In a criminal case, the prosecution must convince the jury that the defendant is guilty beyond a reasonable doubt. This is similar to scientific research, where a small p-value (e.g., p < .05) permits the conclusion that the null hypothesis is false beyond a reasonable doubt. The alpha level (e.g., α = .05) defines what counts as reasonable, and when p < α, the evidence is beyond a reasonable doubt. Nevertheless, some doubt remains, which is called Type 1 error. Interestingly, in civil cases, a verdict for the plaintiff requires only a preponderance of the evidence, as if α = .50. Please note that I am not proposing any specific α level, cutpoint, or bright line for statistical significance; that decision depends on the research methodology. However, statistical significance is required by scientists who believe that natural phenomena materialize randomly (probabilistically).
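The court analogy can be checked with a quick simulation. This is only my illustrative sketch (not anything from the thread's papers): when the null is true, the rate of "guilty" verdicts against an innocent null, i.e., the Type 1 error rate, lands at roughly the chosen α.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
alpha = 0.05       # the "beyond a reasonable doubt" threshold
n_sims = 10_000
rejections = 0

for _ in range(n_sims):
    # Two samples drawn from the SAME normal distribution: the null is true.
    a = rng.normal(size=30)
    b = rng.normal(size=30)
    _, p = stats.ttest_ind(a, b)
    if p < alpha:
        rejections += 1  # a Type 1 error: convicting an innocent null

print(rejections / n_sims)  # hovers near alpha = 0.05
```

Raising α toward .50 (the civil-court "preponderance" standard in the analogy) would simply raise this false-conviction rate in step.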
------------------------------
Eugene Komaroff
Professor of Education
Keiser University Graduate School
------------------------------
Original Message:
Sent: 05-20-2024 08:04
From: Jonathan Shuster
Subject: Intersection of statistical significance and substantive significance
I think you overstate the "ban" on hypothesis testing. That was not a unanimous conclusion in the 2019 AmStat special issue. Although we all agree that estimation (confidence or credibility statements) takes precedence over P-values in most situations, there are applications where there is no primary outcome parameter or parameters. For example, much of US law is rightly based on hypothesis testing, with varying burdens of proof required. A binary decision is required as to whether or not there is sufficient evidence to meet the preassigned burden of proof. The probability of "guilt" given the evidence is nearly always intractable.
Best,
Jon
------------------------------
Jonathan Shuster
Original Message:
Sent: 05-18-2024 09:49
From: Eugene Komaroff
Subject: Intersection of statistical significance and substantive significance
Last summer, I participated in a somewhat lengthy discussion about statistical significance, mainly with Sander Greenland. I stated that proper education, not a ban on statistical significance, is required to ameliorate abuse and misuse. Towards that end, last week my open-access paper "Intersections of Statistical Significance and Substantive Significance: Pearson's Correlation Coefficients Under a Known True Null Hypothesis" was published by Qeios at https://www.qeios.com/read/PS72PK
This paper and a previous publication were responses to the ban on statistical significance proclaimed by the editors of a special issue of The American Statistician and by the editors of Basic and Applied Social Psychology. My previous paper (Komaroff, 2020) is rather dense with theoretical considerations and is behind a paywall. It has been cited by engineers working on lithium batteries and cryptocurrency but missed my target audience of students, applied researchers, and science writers. Therefore, for this new paper, I kept things simple and open access. I simulated empirical sampling distributions of p-values and correlation coefficients that demonstrate, with relatively simple graphs and 2 x 2 crosstab analyses, that statistical significance is still a vital and viable tool for screening out false effect sizes (effect size errors) when working with small sample sizes (n < 1,000).

In addition, many undergraduate and graduate students have told me that increasing sample size increases the chance of finding statistical significance. Unfortunately, their textbooks and professors misinformed them. The five empirical sampling distributions demonstrate that increasing sample size does not increase the probability of finding a statistically significant p-value. However, with n = 1,000, statistical significance loses its purpose because non-effects or very small, trivial correlations (by Cohen's criteria) are now statistically significant.
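Both points can be reproduced in a few lines. This is my own sketch, not the paper's actual code; I assume the same kind of setup the paper describes, namely Pearson correlations between independent normal variables, so the null (ρ = 0) is known to be true.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
alpha = 0.05
n_sims = 2000
rates, min_sig_r = {}, {}

for n in (30, 100, 1000):
    sig_rs = []
    for _ in range(n_sims):
        x = rng.normal(size=n)
        y = rng.normal(size=n)  # independent of x, so rho = 0: the null is true
        r, p = stats.pearsonr(x, y)
        if p < alpha:
            sig_rs.append(abs(r))
    rates[n] = len(sig_rs) / n_sims  # Type 1 error rate at this sample size
    min_sig_r[n] = min(sig_rs)       # smallest |r| flagged "significant"
    print(n, rates[n], round(min_sig_r[n], 3))
```

The rejection rate stays near α = .05 at every n, so a larger sample does not raise the chance of significance under a true null. What does change is the smallest correlation that reaches significance: at n = 30 it is sizable, while at n = 1,000 it is tiny by Cohen's criteria, which is the sense in which significance loses its screening purpose at large n.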
The Qeios paper is open for peer review. I invite your review/comments on the Qeios website, in this ASA Connect thread, or by email at komaroffeugene@gmail.com.
References
Komaroff, E. (2024). Intersections of Statistical Significance and Substantive Significance: Pearson's Correlation Coefficients Under a Known True Null Hypothesis. Qeios. doi:10.32388/PS72PK.
Komaroff, E. (2020). Relationships between p-values and Pearson correlation coefficients, Type 1 errors and effect size.
------------------------------
Eugene Komaroff
Professor of Education
Keiser University Graduate School
------------------------------