Deciding on "one-tailed vs. two-tailed" is something you do at the planning stage. I discussed the issue in:
O'Brien RG, Castelloe JM (2007), "Sample-Size Analysis for Traditional Hypothesis Testing: Concepts and Issues," in Dmitrienko A, Chuang-Stein C, D'Agostino R (Eds.), Pharmaceutical Statistics Using SAS: A Practical Guide, Cary, NC: SAS Press, 237-271.
Get it here: http://hal.case.edu/~robrien/O'BrienCastelloe07.pdf
As for using, say, 0.01 and 0.04 as the two alphas, this conforms to the recommendations in this cool paper by two outstanding statistical thinkers:
Rosenthal, R. and Rubin, D. B. (1983). Ensemble-adjusted p values. Psychological Bulletin, 94(3):540-541.
Get it here: http://hal.case.edu/~robrien/Rosenthal83Ensemble-adjusted%20p%20values..pdf
The bottom line for me is ...
The standard two-sided test is nothing but conducting two one-sided tests and correcting using alpha/2, a la Bonferroni. To say we should always do this is dogma I have never understood. Now I fight it.
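A minimal sketch of that equivalence in Python, assuming a z statistic with a standard normal reference distribution. The function names, and the choice of putting 0.04 in the upper tail and 0.01 in the lower tail, are illustrative assumptions and are not taken from either paper.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal reference distribution (assumed)


def one_sided_p_values(z):
    """Return (p_upper, p_lower) for a z statistic."""
    p_upper = 1.0 - STD_NORMAL.cdf(z)  # alternative: effect is greater
    p_lower = STD_NORMAL.cdf(z)        # alternative: effect is less
    return p_upper, p_lower


def two_sided_p_value(z):
    """Conventional two-sided p-value."""
    return 2.0 * (1.0 - STD_NORMAL.cdf(abs(z)))


def split_alpha_decision(z, alpha_upper, alpha_lower):
    """Run two one-sided tests, each at its own alpha."""
    p_upper, p_lower = one_sided_p_values(z)
    if p_upper <= alpha_upper:
        return "greater"
    if p_lower <= alpha_lower:
        return "less"
    return "no rejection"


z = 1.8  # hypothetical test statistic
# The two-sided p-value is twice the smaller one-sided p-value.
print(two_sided_p_value(z), 2.0 * min(one_sided_p_values(z)))
# Symmetric split (alpha/2 in each tail) versus a lopsided split.
print(split_alpha_decision(z, alpha_upper=0.025, alpha_lower=0.025))
print(split_alpha_decision(z, alpha_upper=0.04, alpha_lower=0.01))
```

With this hypothetical z, the symmetric split does not reject, while the lopsided split rejects in the upper tail. Both designs spend a total alpha of 0.05.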
We need to form analyses around the given research questions. Often the question is, Is A123 BETTER/GREATER than B456? If so, then both "A123 is 'ESSENTIALLY EQUIVALENT' to B456" and "A123 is WORSE/LESS than B456" make up a single set of "A123 is NOT BETTER / NOT GREATER than B456." This is a single question, hence no need to adjust for multiple testing. In my view, the frequentist should just find the lower limit of the upper 1-alpha (one-sided) CI, and if you really want to, the one-tailed p-value.
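As a rough illustration of that lower limit and the matching one-tailed p-value, here is a sketch in Python. It assumes a normal approximation for the estimated difference (A123 minus B456); the estimate, standard error, and function names are hypothetical and chosen only for the example.

```python
from statistics import NormalDist


def one_sided_lower_bound(estimate, std_error, alpha=0.05):
    """Lower limit of the one-sided (1 - alpha) confidence interval."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    return estimate - z_crit * std_error


def one_tailed_p_value(estimate, std_error):
    """p-value for the alternative that the true difference is above 0."""
    return 1.0 - NormalDist().cdf(estimate / std_error)


# Hypothetical numbers: estimated difference and its standard error.
diff, se = 2.0, 1.1
lower = one_sided_lower_bound(diff, se, alpha=0.05)
p = one_tailed_p_value(diff, se)
print(f"lower limit = {lower:.3f}, one-tailed p = {p:.4f}")
```

Under these assumptions the two summaries agree: the lower limit sits above 0 exactly when the one-tailed p-value is below alpha.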
Two stories. (Hey, I'm 64; I've got too many stories.)
Twenty-three years ago, I helped design a trial to test a drug for a very rare genetic disease that had no known effective treatment. We needed to better balance Type I and Type II error rates, so I proposed doing a one-tailed test, maybe even at alpha = 0.10; I can't remember exactly. Heresy! Would the FDA balk? I took the proactive route by going to DC and presenting my rationale to FDA biostatisticians face-to-face. For the first hour I met with just a single person and presented my case. He then called in three others and we went through it all again. After these 2.5 hours, my strategy was given a thumbs up and we conducted the trial accordingly. My point? The trial had special needs, so the statistical planning required custom care. BUT MANY STUDIES HAVE SPECIAL NEEDS AND THUS REQUIRE SPECIAL CARE. If the research question is one-sided, so be it. I teach this in a "certificate" course in clinical trials put on by UCSF, a course that is loaded with FDA and industry people both as students and instructors.
About the same time, I co-authored a report of another small trial, which was published in the Annals of Internal Medicine. The written protocol had called for one-tailed tests, so that's what we reported. The statistical editor balked and, in a long phone call I had with him, stated that this journal never allowed such things, because investigators might cheat by claiming a one-tailed hypothesis after seeing that their two-tailed p-value came out between 0.05 and 0.10. But we had a written protocol! For this particular issue, it made no difference. I just doubled all the p-values in the paper and they were all still below 0.05, so everyone was blissfully happy. So silly.
And finally ...
Arguably--Oh, should I bring this up here?--the issue almost goes away when you "go Bayes." (Yes, Jason Connor, I fully 'get it' now and so do my students.) But I'll leave all that for another polemic.
-Ralph
--
Ralph O'Brien, PhD
Professor of Biostatistics
Interim Director, CCI Statistical Sciences Core
Dept of Epidemiology and Biostatistics
Case Western Reserve University
Office: 216.368.1927; Cell: 216.312.3203
"Tell me and I forget, teach me and I may remember, involve me and I learn."
― Benjamin Franklin
Original Message:
Sent: 10-29-2013 23:22
From: Stephen Simon
Subject: Choosing the number of tails for a test in planning a study
Robert Abelson also talks about this in his book, Statistics as Principled Argument, using colorful terms like the "tail and a half test" and the "lopsided test".
-------------------------------------------
Stephen Simon
Independent Statistical Consultant
P. Mean Consulting
-------------------------------------------