With the utmost respect to our forefathers, my theory is the following irony, based on the enmity between Karl Pearson (KP) and R.A. Fisher.
Pearson (1900) introduced the P-value with the Chi-Squared test. The paper included, as I recall, a table of the CDF for Chi-Squared in regular increments, I believe increments of 0.1 in Chi-squared.
Fisher's 1925 text, Statistical Methods for Research Workers, the first statistics textbook, did not include KP's table. I believe Joan Fisher Box wrote in the biography of her father that he didn't want to ask KP's permission, or didn't want to give KP the credit. My idea--at least I think it's my idea--is that this led Fisher to create tables of critical values to be used in inference, rather than the entire CDF. That is, Fisher (in my mind) introduced the critical quantiles, i.e., significance levels, of 0.01, 0.05, 0.10, .... The irony is that the concept of critical values at these probability levels--significance levels--is the first step from KP's Test of Significance with P-values to Neyman and Egon Pearson's (1928, 1933) Test of Hypothesis with Errors of the First and Second Kind. So Fisher's ego trip versus KP may have resulted not only in Fisher overshadowing KP, but also in a little "payback" by KP's son (with Jerzy Neyman, of course), to the extent that the Neyman-Pearson method somewhat supplanted Fisher's methods.
Do you think there's anything to this?
-------------------------------------------
Golde Holtzman
Associate Professor Emeritus
Virginia Tech (VPI)
-------------------------------------------
Original Message:
Sent: 11-24-2014 10:41
From: Herbert Weisberg
Subject: a different look at hypothesis tests
Fisher did indeed first suggest the use of 0.05 as a general guideline, but the "sanctification" of this value came later, was abhorred by Fisher, and was primarily the result of the N-P decision-theoretic formulation. I try to unravel the confusion around all this in Chapter 10 of my recent book Willful Ignorance: The Mismeasure of Uncertainty (Wiley, 2014).
-------------------------------------------
Herbert Weisberg
President
Causalytics, LLC
-------------------------------------------
Original Message:
Sent: 11-21-2014 21:03
From: David Bernklau
Subject: a different look at hypothesis tests
Dr. Elston:
Regarding your parenthetic comment, in Erich Lehmann's last book, Fisher, Neyman, and the Creation of Classical Statistics (published posthumously in 2011), after quoting a passage by Fisher in SMRW (which concludes 'We shall not often be astray if we draw a conventional line at .05 and consider that higher values of chi-square [χ²] indicate a real discrepancy'), he writes:
"This statement has been quoted in full because of its great influence. Fisher's recommendation of 5% as a fixed standard took hold and, for good or ill, has permeated statistical practice." [Page 17]
Dr. Lehmann gives examples from SMRW on Page 53, writing "His [Fisher's] interest was not in p-values, but in deciding whether or not the results were significant, where in nearly all cases he drew the line at 5% [throughout all 14 editions of SMRW]."
HTH
-------------------------------------------
David Bernklau
(David Bee on Internet)
-------------------------------------------
Original Message:
Sent: 11-20-2014 23:09
From: Robert Elston
Subject: a different look at hypothesis tests
Margot,
Your first paragraph intrigues me. It implies that if Ho is true, the false discovery rate for alpha = .05 is 0.64. In much of what I do (genetics and genomics), so many hypotheses are being tested that the proportion of true discoveries must be very small, so 0.64 would indeed be very close to the FDR. Now under certain assumptions, alpha = 0.05 not being one of them, in the long run we can expect p > 0.63 to imply Ho is true; see: http://darwin.cwru.edu/ref/view.php?id=316&article=Elston+Reprints (which followed from work I did in the 60s: http://darwin.cwru.edu/ref/view.php?id=22&article=Elston+Reprints).
What a coincidence!
(I believe it was Lehmann, rather than Fisher, who promulgated the use of alpha = 0.05. But I do remember Fisher suggesting, when as an undergraduate I took a course on genetics from him, that if an outcome has a probability of only 1 in 20 of occurring by chance, this would be a reasonable criterion for believing the event did not occur by chance.)
-------------------------------------------
Robert Elston
Case Western Reserve University
-------------------------------------------
Original Message:
Sent: 11-18-2014 12:43
From: Margot Tollefson
Subject: a different look at hypothesis tests
Under the null hypothesis, the outcome of an hypothesis test is a Bernoulli random variable with success probability alpha. The number of Bernoulli trials until the first success is a geometric random variable. That is, if alpha = 0.05, then the number of trials until a false positive occurs is distributed geometrically, with the mean number of trials before the false positive equal to 19, and the mean number of trials including the false positive equal to 20. The geometric distribution is highly skewed. The probability of seeing a false positive within 20 trials is 0.64, for alpha = 0.05.
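The arithmetic above can be checked directly; this is a minimal sketch assuming independent tests, each with false-positive probability alpha = 0.05:

```python
alpha = 0.05

# Mean number of trials before the first false positive
# (failures-before-first-success convention): (1 - alpha) / alpha = 19.
mean_before = (1 - alpha) / alpha

# Mean number of trials including the false positive itself: 1 / alpha = 20.
mean_including = 1 / alpha

# Probability of at least one false positive within 20 trials:
# 1 - (1 - alpha)^20, which is about 0.6415.
p_within_20 = 1 - (1 - alpha) ** 20

print(mean_before, mean_including, round(p_within_20, 4))
```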
I think that the skewed distribution of the geometric is part of the reason that hypothesis tests do not perform well, although the heavy right tail is probably part of the reason, too. I would suggest using the reciprocal of the median of the geometric, where the median is set equal to the expected value of the geometric implied by alpha. This is a straightforward calculation. I have done the calculations in my blog post at http://vanwardstat.wordpress.com, for anyone interested.
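Under one reading of the suggestion above (this is my reconstruction, not necessarily the calculation in the linked blog post), the median of the geometric for alpha = 0.05 is 14, well below the mean of 20, and choosing alpha so that the median, rather than the mean, equals 20 gives a smaller alpha:

```python
import math

alpha = 0.05

# Median number of trials (including the false positive) for a
# geometric with success probability alpha: ceil(ln 0.5 / ln(1 - alpha)).
# For alpha = 0.05 this is 14, illustrating the skewness described above.
median_trials = math.ceil(math.log(0.5) / math.log(1 - alpha))

# Hypothetical adjustment: solve (1 - alpha)^20 = 0.5 so that the
# *median* waiting time is 20, the mean under the nominal alpha = 0.05.
# This gives alpha = 1 - 0.5**(1/20), roughly 0.0341.
alpha_median_20 = 1 - 0.5 ** (1 / 20)

print(median_trials, round(alpha_median_20, 4))
```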
Kind Wishes,
Margot
-------------------------------------------
Margot Tollefson
Consultant
Vanward Statistics
-------------------------------------------