ASA Connect


Cut Point

  • 1.  Cut Point

    Posted 08-04-2023 05:07

    Dear Sander. For some reason, I am unable to reply to the group in the "hypothesis formulation" thread, so I am starting a new thread. 

     
    You asked:  "Are you claiming that p=0.049 and p=0.051 (or 0.0049 and 0.0051 or whatever cutpoint straddling you choose) are scientifically different results, regardless of context and method choices, so that the first is always "significant" and the second is always not, given the cutpoint?" 

    We make choices between two options in our daily lives. The English band The Clash sang about a cut point: "Should I stay or should I go?" When I was teaching my son to drive, I asked him: What will you do when the green traffic light turns yellow at an intersection? He said "Stop" and I said "Go, go, go." Both are sensible actions, but the decision depends on who else, or what else, besides the changing traffic light is at the intersection. 

    In statistics, if I want to calculate the probability of seeing 8 or more heads in 10 flips of a fair coin, I do not admit the real possibility of a coin landing on its side into the binomial calculation. Even the word binomial (two names) indicates a cut point, regardless of whether it was created by Nature or by people. Your question also reminds me of Fisher's remark (I paraphrase because I don't remember where I read this): researchers have strenuous objections to the statistical significance concept when p = .051 but are delighted when p = .049. I am not proclaiming any fixed cut point (alpha) as a standard for decision making. Nonetheless, I believe a cut somewhere on the 0 to 1 (exclusive) probability scale is necessary but not sufficient for choosing between two options.
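
    As an aside, that binomial tail calculation is easy to check with software. Here is a minimal sketch in Python (my illustration only; scipy is an assumed tool, not anything from this thread):

    ```python
    # P(X >= 8) for X ~ Binomial(n=10, p=0.5): the "8 or more heads in 10
    # flips of a fair coin" example, ignoring the coin-on-its-side case.
    from scipy.stats import binom

    n, p_heads = 10, 0.5
    p_tail = binom.sf(7, n, p_heads)   # sf(7) = P(X > 7) = P(X >= 8)
    print(round(p_tail, 4))            # about 0.0547
    ```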


    Thinking about living in a world without a cut point reminds me of a joke Johnny Carson told on his Tonight Show: When you reach the fork in the road, take it!  I'm on the road called Null Hypothesis, and this road splits into "reject the null hypothesis" or "do not reject the null hypothesis." This road doesn't split into "null hypothesis" or "alternative hypothesis." Is the latter the road you're on? 

    In summary, a statistical significance cut point is a prerequisite for substantive significance. Statistical significance is necessary but not sufficient. The conditional decision that follows p < α is the most important. Is the alternative parameter within a 1-alpha confidence interval a practical, meaningful, substantive, credible, accurate estimate? I leave that decision to content experts. A statistician can offer only a generic effect size, but that is merely a tasty appetizer - not a full meal. 

    Best wishes,
    Eugene



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------


  • 2.  RE: Cut Point

    Posted 08-07-2023 14:59

    Dear Eugene,

    Due to its being on a new thread, I only just now saw your response below. I encourage all readers here to first refer back to the lengthy thread on "hypothesis formulation" from which this one on cutoffs (cut points) originated, where they will see I have already made most of the following points in some detail, supplied many citations which discuss further details along with empirical evidence of the problems with "statistical significance", and reviewed some proposed solutions for teaching and practice.

    It seems you have started your current reasoning with examples that satisfy an assumption which is violated in most cases of modern research reporting: That the researcher is charged with making a decision about the estimands (target parameters) under study. This assumption was not too unreasonable for Student (Gosset), charged as he was with making recommendations for crop strains; likewise for much of Fisher's work, which was then idealized by Neyman and Egon Pearson in their formal decision theory. Nonetheless, problems with misinterpretations of what came to be known as "statistical significance" began in the 19th century and were described by Karl Pearson as early as Biometrika 1906. Then, by the mid-20th century, the research literature became dominated by a confusion of Fisherian and NP theory in which "significance" was assessed only for null ("no effect" or "no association") hypotheses, the infamous NHST. Meanwhile, in the math-stat literature Neyman-Pearson (NP) had become so dominant as to lead some to equate statistical inference to decision theory - a problem decried by Fisher himself (JRSS B, 1955). 

    This equation of statistical inference with decisions (such as "significant" vs. "nonsignificant") is a false one: While some settings call for decisions, by the 1950s Cox, Lehmann, and even Egon Pearson (who backed away from Neyman's behavioristic views) had come to recognize that, for much of the then-burgeoning realm of research, the primary purpose of most study reports was to
    1) accurately describe how the data were generated and then
    2) summarize the data set and the information it contained in a form that could be pooled with other information (whether to aid hypothesizing of explanations of literature patterns, or to reach policy decisions).
    This is why they advised, quite explicitly, that P-values ("observed significance levels") be reported in continuous form, as could now be done to reasonable accuracy with the advent of electronic computers, along with estimates. To use your analogy, when (as is almost always the case in pure research) there is no universal justification for a single cutpoint, there is no justification for forcing the reader down one branch in a fork with artificial and unnecessary declarations of "significance" or "nonsignificance". Around the same time, Cox (AMS 1958, p. 363) and Birnbaum (JASA 1961) further recognized that one could bypass using a cutoff for interval estimates by showing a P-value function (or as they and the subsequent math-stat literature called it, a "confidence distribution" or "confidence curve"). As they said, by following their presentation advice, if the reader wanted to make a decision they could compare p to whatever cutoff they chose and could use whatever cutoff they wanted to construct their interval estimate. 
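
    (To make the P-value function idea concrete, here is a minimal sketch in Python; the data are invented and the normal approximation is my simplifying assumption, not anything Cox or Birnbaum prescribed.)

    ```python
    # P-value ("confidence") function for a mean: the two-sided P-value
    # computed at every candidate value of the mean, from which an
    # interval estimate at any chosen cutoff can be read off.
    import numpy as np
    from scipy import stats

    data = np.array([4.1, 5.3, 3.8, 6.0, 5.5, 4.9, 5.2, 4.4])   # made-up sample
    xbar, se = data.mean(), data.std(ddof=1) / np.sqrt(len(data))

    mu_grid = np.linspace(xbar - 4 * se, xbar + 4 * se, 401)
    p_curve = 2 * stats.norm.sf(np.abs(xbar - mu_grid) / se)

    # Cutting the curve at p >= 0.05 recovers the usual 95% interval;
    # any other cutoff gives the corresponding interval.
    ci95 = mu_grid[p_curve >= 0.05]
    print(round(ci95.min(), 2), round(ci95.max(), 2))
    ```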

    Cox, Hill and others were also clear that these statistics were not sufficient for real-world decision making whenever those decisions required examining multiple lines of evidence. A case study in which Fisher failed to make such examinations competently and thus badly tarnished his own reputation was recounted by Stolley 1991 ("When genius errs: R. A. Fisher and the lung cancer controversy", American Journal of Epidemiology, 133, 416-425): By the end of the 1950s the relation of cigarette smoking to lung cancer was recognized as incredibly strong, "statistically significant", and as biologically supported as that of any lifestyle factor ever seen; and sensitivity analyses made clear that explanations in terms of confounding (self-selection) bias were farfetched (Cornfield et al. 1959. "Smoking and lung cancer: recent evidence and a discussion of some questions". J Natl Cancer Inst, 22, 173-203.).

    Statistically, there were over a dozen human studies that needed to be pooled, not acted on individually. Thus the central point of contention in policy decisions was not whether some p-cutoff was met, but whether a decision could be reached based on the totality of evidence from all lines: pathophysiologic theory, cellular and animal experiments, and the many human observational studies (no randomized human trials, which was one of Fisher's primary objections to the decision to declare smoking a health hazard). This inferential setting was the central topic of Sir Austin Bradford Hill's famous paper, "The environment and disease: association or causation?" (Proc R Soc Med 1965;58:295–300), in which p-cutoffs were not even mentioned and "statistical significance" did not appear in his list of considerations for causal inference; Hill instead presented his list and then wrote that significance tests (by which he meant P-values, as in Cox & Hinkley 1974, Theoretical Statistics, Ch. 3) "can, and should, remind us of the effects that the play of chance can create, and they will instruct us in the likely magnitude of those [chance] effects. Beyond that they contribute nothing to the 'proof' of our [causal] hypothesis" (Hill, p. 299). That quote is of course in some accord with your comment about "statistical significance" serving as an appetizer, but we remain at odds about the phrasing and implementation of that notion, with you sharply committed to tradition and me sharply opposed because of the well-documented destructiveness of the tradition. 

    Many have ascribed the ongoing defenses of "statistical significance" to defensiveness about past teaching and practice. While I am convinced that is a major factor, I also see at work the basic human impulse toward oversimplification, as manifested by reduction of statistics to simple binary descriptions. Again, this impulse has been quite destructive, both in encouraging misreporting of results (especially in reporting "no association" because p>0.05, which Pearson 1906 warned against) and in creating severe publication bias, as seen for example in Fig. 1 of van Zwet & Cator 2021, https://onlinelibrary.wiley.com/doi/full/10.1111/stan.12241.

    Astute statisticians often lament degradation of crucial covariates into dichotomies (as in the horrid analyses of age-treatment interactions based solely on an age cutoff, e.g., "<60 vs. 60+"), yet too many seem tolerant of the similar information degradation inherent in saying "statistically significant". The misplaced impulse toward unnecessary (and often misleading) forcing of "significance" declarations from statistical outputs is a form of dichotomania, also known as "black-or-white thinking" (see Greenland 2017. "The need for cognitive science in methodology." American Journal of Epidemiology, 186, 639-645, https://academic.oup.com/aje/article/186/6/639/3886035). It reflects a compulsion to degrade detailed information into binary indicators, even when preserving the information would involve no cost. For example, it takes less space and is less ambiguous to write "the result had p=0.023" than to say "the result was statistically significant", noting the conflict over what should be a standard reference cutoff (e.g., 0.05 vs. 0.005). The reader can compare p=0.023 to any cutoff (provided the cutoff and the reason for its choice are given). 

    I regret that I have not seen where you address these views or the citations I have given in the "hypothesis formulation" thread, nor have I seen your proposals for teaching, practice, review and editorial innovations to address the ongoing literature problems flowing from the mindset and language of "statistical significance". For at least a half century that mindset and language have been condemned by many statisticians and researchers alike as toxic and outdated. It is of little comfort that other sciences have faced equally tenacious resistance to change. In the early 20th century, some physicists struggled to preserve absolute space and time as foundational concepts, rather than as simplifying constructs useful in certain domains; for example, Albert Michelson himself (famed for the Michelson-Morley experiment and the first American Nobelist in physics) was reputed never to have accepted relativity. Perhaps that was an example Max Planck had in mind when he made a remark that we now compress into the aphorism that "science progresses funeral by funeral". Alas, it seems that on basic matters, statistical science progresses at a geological pace.

    Best Wishes,
    Sander



    ------------------------------
    Sander Greenland
    Department of Epidemiology and Department of Statistics
    University of California, Los Angeles
    ------------------------------



  • 3.  RE: Cut Point

    Posted 08-09-2023 09:33

    Dear Sander. Thanks for continuing our discussion on this new thread, because I am sensing common ground. Below I quote some of your statements, each followed by my reaction/reply. 

    "It seems you have started your current reasoning with examples that satisfy an assumption which is violated in most cases of modern research reporting: That the research is charged with making a decision about the estimands (target parameters) under study." 

    I have no problem with confidence intervals as estimands. I just do not understand the benefit of a confidence interval instead of p-values as a statistical test of significance. 

    "Nonetheless problems with misinterpretations of what came to be known as 'statistical significance' began in the 19th century and were described by Karl Pearson as early as Biometrika 1906. 

    I am eager to see "statistically significant – don't say it" become a rule. It is bewildering how the meaning of a relatively simple concept about excluding random chance as a rival explanation for a result has devolved into mindless misinterpretations of great consequence.

    "This equation of statistical inference with decisions (such as "significant" vs. "nonsignificant") is a false one."

    Here I do not understand. Are you criticizing the word "significant" or the "go/no go" dichotomy in general? If the problem is the word "significant", then I agree it is high time for a suitable alternative that does not carry a weighty implication of "significance". 

    "…the primary purpose of most study reports was to 1) accurately describe how the data were generated" 
    Surely only an ignorant or unethical researcher would hide unanticipated "shocks" (as these are called in econometrics) that arose during the data collection process and skewed the results. 

    "This is why they advised, quite explicitly, that P-values ("observed significance levels") be reported in continuous form, as could now be done to reasonable accuracy with advent of electronic computers, along with estimates." 

    Here we disagree. I see no point in reporting a difference in means or proportions followed by a p-value with no interpretation. Besides, Wasserstein, Schirm & Lazar (2019) posited that "no p-value can reveal the plausibility, presence, truth, or importance of an association or effect." I recall reading an article in a leading medical journal where an author stated that "two variables were strongly related (p = .001)." There was no correlation coefficient reported and the p-value was infused with wishful thinking. 

    "if the reader wanted to make a decision they could compare p to whatever cutoff they chose and could use whatever cutoff they wanted to construct their interval estimate." 

    Yes, I agree. 

    "A case study in which Fisher failed to make such examinations competently and thus badly tarnished his own reputation was recounted by Stolley 1991 ("When genius errs: R. A. Fisher and the lung cancer controversy" ,American Journal of Epidemiology, 133, 416-425):" 

    According to https://www.psycom.net/cognitive-dissonance, Festinger (father of cognitive dissonance theory) gave the following example: "A heavy smoker who knows smoking is bad for his health will experience dissonance because he continues to puff away. He can reduce the dissonance by quitting smoking; changing his beliefs on the effect smoking has on his health (that it doesn't cause lung cancer); adding a new belief by looking for the positive effects of smoking (it reduces anxiety and weight gain); reducing the importance of the belief by convincing himself that the risks of smoking are miniscule compared to the risk of an automobile accident." I have no doubt that Fisher believed that correlation does not imply causation; therefore the relationship may be suggestive but is not conclusive evidence of cause and effect. Besides, Fisher was a heavy pipe smoker, so he may have experienced cognitive dissonance. He postulated that a gene, a lurking common variable, could explain both a penchant for smoking and a propensity for lung disease. 

    "This inferential settting was the central topic of Sir Austin Bradford Hill's famous paper, in which p-cutoffs were not even mentioned and "statistical significance" did not appear in his list of considerations for causal inference…" 

    Yes, I am aware of Hill's list for causal reasoning with observational data: 
    1.    Strength of Association: A stronger association is more likely to have a causal component than is a modest association.
    2.    Consistency: A relationship is observed repeatedly.
    3.    Specificity: A factor influences specifically a particular outcome or population.
    4.    Temporality: The factor must precede the outcome it is assumed to affect.
    5.    Biological Gradient: The outcome increases monotonically with increasing dose of exposure or according to a function predicted by a substantive theory.
    6.    Plausibility: The observed association can be plausibly explained by substantive (e.g., biological) explanations.
    7.    Coherence: A causal conclusion should not fundamentally contradict present substantive knowledge.
    8.    Experiment: Causation is more likely if evidence is based on randomized experiments.
    I found the list in an instructive article: Rothman K.J. and Greenland S. Causation and causal inference in epidemiology. American Journal of Public Health. 2005, 95(S1). S144-S150. The paper begins with an intriguing statement: "concepts of cause and causal inference are largely self-taught from early learning experiences." 

    "I also see at work the basic human impulse toward oversimplification, as manifested by reduction of statistics to simple binary descriptions." 

    Reminds me of Einstein's quote: "Everything should be made as simple as possible, but no simpler." For me, "reduction of statistics to simple binary description" is not the end, but the start of statistical reasoning. 

    "Astute statisticians often lament degradation of crucial covariates into dichotomies (as in the horrid analyses of age-treatment interactions based solely on an age cutoff, e.g., "<60 vs. 60+")" 

    I have wondered why epidemiologists favor categorizing continuous variables, like age, in their statistical analyses. Could it be that odds ratios (ORs) for categories are easier to interpret and understand than ORs for continuous explanatory variables?  

    "I regret that I have not seen where you address these views or the citations I have given in the "hypothesis formulation" thread, nor have I seen your proposals for teaching, practice, review and editorial innovations to address the ongoing literature problems flowing from the mindset and language of "statistical significance".

    Alright, let me try a bit of teaching. Fisher started the "significance" conundrum with p < .05. Calling a p-value "statistically significant" has misled researchers into thinking they found something important. Fisher's statistical cut point is actually trivial compared to a cut point on a Medical Board Licensing Exam, for instance, which a young MD, who spent a lot of money on tuition, must get over (pass) to practice medicine. A summary statistic that is "approximately two standard errors away from a null parameter" may be relatively unusual, and that is all. It is time to eradicate "statistically significant" from statistical jargon. Instead, I would call any observed p-value an "Xth percentile p-value." For instance, p = .049 would be the "4th percentile p-value" and p = .051 would be the "5th percentile p-value." Notice no inequality symbols, because that is understood with percentiles. I am writing a paper for publication that demonstrates the "percentile p-value" concept with numbers and graphs. However, I would be delighted to elaborate here, if you want.

    Furthermore, I see no reason for the adjective "null" to modify any hypothesis. There has been much confusion, with NHST also referred to as the "Nil" Hypothesis Significance Test. The "0" parameter was necessary when t-tests were done with mechanical calculators and a Fisher-Yates table was the easiest and quickest way to determine the probability. These days, a researcher can run many t-tests quickly with statistical software on a laptop, where a p-value is revealed almost instantly. In addition, statistical software, such as SAS, permits the specification of any reasonable, exact parameter with a simple fill-in-the-blank option: "H = _ ".
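
    (For readers who do not use SAS, here is a minimal sketch of the same idea in Python; the data and the hypothesized value of 100 are invented purely for illustration.)

    ```python
    # One-sample t-test against a non-zero hypothesized mean: modern
    # software lets you plug in any value, not just the "nil" 0.
    import numpy as np
    from scipy import stats

    scores = np.array([103, 98, 110, 105, 99, 101, 107, 95, 104, 102])
    result = stats.ttest_1samp(scores, popmean=100)   # H: mu = 100
    print(round(result.statistic, 3), round(result.pvalue, 3))
    ```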

    Nonetheless, I will continue to object to "statistically significant – don't use it." Fisher's beautiful and relatively simple statistical thinking is impossible without an alpha percentile p-value cut point. Of course, any alpha percentile level must be declared in light of confounded research design and execution issues. 

    Best wishes. 
    Eugene



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------



  • 4.  RE: Cut Point

    Posted 08-10-2023 13:52

    I hesitate to get involved in this (interesting) discussion, but Eugene wrote 

    If the problem is the word "significant", then I agree it is high time for a suitable alternative that does not carry a weighty implication of "significance". 

    Yesterday at JSM I wore a t-shirt that used to say "I'm Statistically Significant" but I changed it to "I'm Statistically Discernible." I did this in support of my ongoing promotion of replacing "statistically significant" with "statistically discernible." See an editorial I wrote a while back: https://www.tandfonline.com/doi/full/10.1080/10691898.2019.1702415

    Jeff Witmer



    ------------------------------
    Jeffrey Witmer
    Professor
    Oberlin College
    ------------------------------



  • 5.  RE: Cut Point

    Posted 08-11-2023 08:40

    Hello Jeffrey, you have the right idea. The meaning of "statistically significant" has been distorted and diffused into all sorts of unrelated concepts, and that includes the alternative hypothesis. One problem is that significant, like discernible, is an adjective. The second problem is that, according to the Merriam-Webster dictionary via Google, there are many synonyms or similar words for discernible: "Distinguishable, perceptible, noticeable, detectable, obvious, appreciable, distinct, significant, palpable, apparent, identifiable, visible, sensible, tangible, observable, apprehensible, audible, evident, prominent, conspicuous, striking, manifest, clear, ponderable." Notice that significant is also on the list. I don't see how replacing one adjective with another similar one helps clean up the muddy confusion.   

    One cause of the misunderstanding is the missing noun after "statistically significant". What exactly is "statistically significant"? In my reading of Fisher (1925), my first encounter with significant was in the following sentence: "Deviations exceeding twice the standard deviation are thus formally regarded as significant." For Fisher, it was the difference (distance) between a summary statistic and a hypothesized parameter in standard error units that could be significant. Fisher was probably thinking "z-test", where the distance is quantified in standard deviation units.   

    Fisher R.A. (1925). Statistical Methods for Research Workers. Edinburgh: Oliver & Boyd.
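
    (A minimal sketch of that distance-in-standard-error-units idea, with made-up numbers, just to fix the arithmetic:)

    ```python
    # Distance between a summary statistic and a hypothesized parameter
    # in standard-error units, and the corresponding two-sided P-value.
    from scipy import stats

    xbar, mu0, se = 5.4, 5.0, 0.18
    z = (xbar - mu0) / se             # about 2.22 standard errors
    p = 2 * stats.norm.sf(abs(z))     # two-sided P, about 0.026
    print(round(z, 2), round(p, 3))
    ```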

      



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------



  • 6.  RE: Cut Point

    Posted 08-11-2023 12:27

    Weighing in as a clinical trial biostatistician: Regulators and presenters of clinical trial results must convey clinical trial success decisions, and an important component is statistical evidence. The use of the term "significant" is pervasive because statistical success criteria are based on P-values. The term "significant" is commonly used to convey statistical success when there are planned criteria for success, but it is also naively used when reporting the results of tertiary (exploratory) analyses or "waterboarding" data without planned criteria for success. Thus, replacing the appropriated term "significant" is a good thing. However, proposed replacements to be used in the context of the existence of planned statistical success criteria must convey the existence of such, something like "meets specified criterion." When results without a planned statistical success criterion are presented, the term "significant" could be replaced by candidates like "exploratory,"  "possibly confounded," "model derived with subjectively selected terms,"  "data-directed," "the result of waterboarding," ... :-) 



    ------------------------------
    Brent Blumenstein
    ------------------------------



  • 7.  RE: Cut Point

    Posted 08-12-2023 05:34

    I believe "significant" for Fisher meant only that chance or coincidence was ruled out as explanation for an effect. That should be the end of statistical thinking and the start of substantive, practical, meaningful thinking about the "significance" of the effect.

    Eugene 



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------



  • 8.  RE: Cut Point

    Posted 08-12-2023 11:53

    Yes, that's what Fisher meant. And if he had not come up with a word and we were given the task today of choosing the worst possible word to attach to the finding of a small p-value, the word "significant" would be a leading contender. Maybe you can think of a worse word than "significant," but given how people use "significant" in everyday life, outside of statistical conversations, it is a horrible word for statisticians to use in connection with hypothesis testing. 

    Indeed, I've seen statisticians confuse themselves: first they get a small p-value and call their result "significant" (but maybe mumble something about this being different than "important") and a while later they talk about their findings as being important solely because they earlier found a small p-value, having forgotten that their effect size was tiny (but a large sample size led to their p-value). I think that kind of error would become vanishingly rare if we said that a small p-value tells us that the result is "statistically discernible," as in "yes, we had enough data to be confident that what we saw wasn't due to sampling variability -- the test discerned something -- but that's all; whether or not this finding is important is (roughly) orthogonal to the p-value calculation."
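
    (A minimal simulation sketch of that point, with an invented effect size and sample size: a trivially small difference paired with a very large sample yields a tiny p-value.)

    ```python
    # Tiny effect + huge n => small p-value: "discernible" is not "important".
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 200_000
    x = rng.normal(0.00, 1.0, n)      # control group
    y = rng.normal(0.02, 1.0, n)      # treated group, shifted by only 0.02 SD
    t, p = stats.ttest_ind(y, x)
    print(round(y.mean() - x.mean(), 3), f"p = {p:.1e}")
    ```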

    Jeff



    ------------------------------
    Jeffrey Witmer
    Professor
    Oberlin College
    ------------------------------



  • 9.  RE: Cut Point

    Posted 08-12-2023 15:29

    Jeffrey:

    I liked your JSE editorial, and it seems we all agree that "statistically discernible" is superior to "statistically significant".
    I also think "discernible" needs more elaboration to accommodate the call for viewing P-values as continuous measures.

    To paraphrase my last post: Just as we can present and interpret r2 as a basic continuous index of linearity ranging from 0 to 1, we can present and interpret a P-value as a basic continuous index of compatibility (or, as Cox put it, consistency; as Kempthorne & Folks put it, consonance; or, as Pearson put it, goodness-of-fit) between the data and a hypothesis or model for the actual data-generating process. And a P-value can be transformed into a simple measure of the information it supplies against the hypothesis or model: the binary surprisal or S-value, s = log2(1/p) = -log2(p).
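
    (A minimal sketch of that transformation, just to show the numbers; the example P-values are arbitrary:)

    ```python
    # S-value (binary surprisal): bits of information against the model.
    import math

    for p in (0.05, 0.023, 0.005, 0.51):
        s = -math.log2(p)
        print(f"p = {p:<6} S = {s:.1f} bits")
    ```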

    We could say P-values index degrees of discernibility, ranging from p=1 when p finds no discernible difference between the data and the model (although other statistics might find differences), to p=0 when the data logically contradict the model. But then this scale is backwards, as we should want to index discernibility from none=0 to complete=1. That could be done with 1-p, which is an index of incompatibility, inconsistency, dissonance, or badness-of-fit between the data and the hypothesis or model. The S-value could also serve as a discernibility measure, with s=0 indicating no discernible difference and larger s corresponding to more discernible differences.

    One quibble: I would strongly encourage reporting P-values to at least two "significant" digits! I have had experiences in which upon close review of a paper I suspected errors, or wished to compute unreported quantities. P-values reported to only one digit were nearly useless for that task, whereas to two or more digits they enabled error detection and further computations. Thus, the second digit can indeed matter in practical application, whether the P-value is large or small, and so the P-value is an example where both too much and too little precision are possible.

    Finally, some relevant historical points:

    1) Fisher did not initiate the use of "significant" and its variations for statistical tests, nor did he invent its use for dealing with chance variation over repeated sampling. The earliest cite I know of is Edgeworth 1885 (before Fisher was born in 1890), e.g., see p. 183 onward of http://www.jstor.org/stable/25163974
    The usage can also be found in pre-Fisher classics like Pearson 1900 and Student 1908, and was apparently regular terminology by the time Fisher encountered statistics.
    (It seems Fisher gets blamed or credited for nearly everything, including much that he adopted from the statistical theory of his student days, e.g. maximum likelihood; see Pratt 1976, https://www.jstor.org/stable/2958222 and his cites, esp. Neyman 1951 and Pearson 1967.)

    2) By 1900 Pearson was using "value of P" for what was also called the significance level. For Fisher a "significance level" was usually not a cutpoint, but was rather the observed tail-area P-value. Fisher also sometimes used "value of P"; the equivalent term "P-value" can be found in research reports by the 1920s. Of course, P-values had been around since the early 18th century and P-fishing appeared by the early 19th century, just not by those names; see Shafer, http://www.probabilityandfinance.com/articles/55.pdf

    3) Calling a fixed cutoff the "significance level" can be found in American writings by the 1950s. This usage has generated enormous confusion for users, since "significance level" now blurs the crucial distinction between the observed P-value p and the pre-specified cutoff it is to be compared to (the alpha level of Neyman-Pearson hypothesis testing).

    Best,
    Sander



    ------------------------------
    Sander Greenland
    Department of Epidemiology and Department of Statistics
    University of California, Los Angeles
    ------------------------------



  • 10.  RE: Cut Point

    Posted 08-14-2023 07:51

    Another source of confusion between statistical and substantive (practical) significance is Fisher's (1970) title for Chapter V: "Tests of Significance of Means, Differences of Means, and Regression Coefficients." This title implies substantive significance; however, the content is all about standard errors and statistical significance.

    I really wonder why statisticians interpret substantive significance for researchers. Such a weighty consideration belongs in the domain of researchers who have content knowledge and expertise.

    Eugene 



    ------------------------------
    Eugene Komaroff
    Professor of Education
    Keiser University Graduate School
    ------------------------------



  • 11.  RE: Cut Point

    Posted 08-11-2023 17:32
    Edited by Eric J. Daza 08-11-2023 17:42

    Love your editorial, @Jeffrey Witmer!

    In 2021 I penned my own editorial in Towards Data Science, Ditch "Statistical Significance" - But Keep Statistical Evidence, encouraging the use of "statistically discernible" instead of "statistically significant". I also provided example text researchers can use in papers, talks, and grant applications.

    I wish I'd known about your editorial, but am delighted to know about it now. It will give much more authoritative support to my own piece. I'll add it to my references and resources just now.

    Best,
    Eric Jay / EJ



    ------------------------------
    -
    Eric J. Daza, DrPH, MPS 🇺🇸🇵🇭 (he/him) | linktr.ee/ericjdaza
    Founder + Chief Editor | statsof1.org
    Health Innovator | Forbes + Fortune + ASA
    Lead Biostatistician (Data Science) | evidation.com
    ------------------------------