Multiple comparison

1. Multiple comparison

Cindy Weng
Posted 07-11-2012 15:10
To whom it may concern:

What is the multiple comparison problem, when should I be concerned about it, and what would be the solution to it?
Are there any good references or books?

Thank you!
Cindy Weng
-------------------------------------------
Cindy Weng
Statistician
-------------------------------------------
2. RE:Multiple comparison

Michael Chernick
Posted 07-11-2012 15:19
You should consider it whenever you are running several simultaneous hypothesis tests.
You can apply bounds like Bonferroni, or adjust p-values for the familywise error rate via resampling methods, or use other approaches.
There are several good books on multiple testing. Jason Hsu's book is excellent. Others are Rupert Miller's "Simultaneous Statistical Inference" (a little outdated, though) and the text by Westfall and Young, which covers the resampling approach.
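
As a rough illustration of the Bonferroni bound mentioned above, here is a minimal Python sketch. It also shows Holm's step-down adjustment, which is not named in this thread but is a common alternative that also controls the familywise error rate. The function names and the p-values are made up for illustration.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def holm_adjust(p_values):
    """Holm step-down adjusted p-values (never larger than Bonferroni's)."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted p-values keep the original ordering.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


p_values = [0.001, 0.012, 0.03, 0.2]  # hypothetical raw p-values
print("Bonferroni:", bonferroni_adjust(p_values))
print("Holm:      ", holm_adjust(p_values))
```

A hypothesis is rejected at familywise level 0.05 when its adjusted p-value is at most 0.05.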

-------------------------------------------
Michael Chernick
Director of Biostatistical Services
Lankenau Institute for Medical Research
-------------------------------------------
3. RE:Multiple comparison

Patrick Spagon
Posted 07-11-2012 15:25
Tamhane and Hochberg also have a text on Multiple Comparison Procedures.

http://www.amazon.com/Multiple-Comparison-Procedures-Probability-Statistics/dp/047056833X/ref=sr_1_3?s=books&ie=UTF8&qid=1342034548&sr=1-3

-------------------------------------------
Patrick Spagon
-------------------------------------------

4. RE:Multiple comparison

Peter Flom
Posted 07-11-2012 15:33
Oh boy, you've opened up a can of worms with this one!

A short summary is the line from Jacob Cohen in his book on multiple regression: "This is a subject on which reasonable people can differ".

My own view is that people concern themselves with this far too much, but this is a symptom of a general, vast over-emphasis on p-values and significance testing. My favorite professor in grad school used to say "Stop p-ing on the research!". Remember what a p-value is: It is a measure of how often, if you do totally ridiculous things, you will get significant results. More technically, it is the proportion of times you would get a test statistic at least as large as the one you got in your sample, if the real effect in the population from which the sample was drawn were 0.

Unless you are on a total fishing expedition, this is a largely irrelevant question. If you are on a total fishing expedition, you have other problems to cope with! :-)

Also remember that, when you lower type 1 error (e.g. by correcting for multiple comparisons) you necessarily raise type 2 error. Which is worse? The default criteria are type 1 = 0.05 (or 0.01 sometimes) and type 2 = .2 (or .1 sometimes) (that is, power = .8 or .9). Is a type 1 error 4 times worse than a type 2 error? It depends.
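
To make the trade-off concrete, here is a small Python sketch under stated assumptions: a one-sided, one-sample z-test with known unit variance, a hypothetical standardized effect of 0.5, a sample size of 25, and a Bonferroni correction for 10 comparisons. None of these numbers come from the thread; they are only there to show how power falls (type 2 error rises) when the per-test alpha is lowered.

```python
from statistics import NormalDist


def power_one_sided_z(alpha, effect_size, n):
    """Power of a one-sided one-sample z-test with known unit variance."""
    std_normal = NormalDist()
    # Reject when the z statistic exceeds this critical value.
    critical = std_normal.inv_cdf(1.0 - alpha)
    # Under the alternative, the z statistic is centered at effect_size * sqrt(n).
    return 1.0 - std_normal.cdf(critical - effect_size * n ** 0.5)


effect_size, n, n_tests = 0.5, 25, 10  # hypothetical values
for label, alpha in (("uncorrected", 0.05), ("Bonferroni", 0.05 / n_tests)):
    power = power_one_sided_z(alpha, effect_size, n)
    print(f"{label}: alpha={alpha:.3f}, power={power:.3f}, type 2 error={1 - power:.3f}")
```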

Then, if you DO decide to correct for multiple comparisons, you have the very vexed question of how many comparisons you should correct for. All the analyses in one table? One article? One area of research? Or what? For example, if 100 people look at the relationship between social security number and weight, then you should correct for ALL 100, even the ones you don't know about!

In short, I think corrections for multiple comparisons are rarely needed.

Rather than evaluate research this way, I like Robert Abelson's MAGIC criteria: Magnitude, Articulation, Generality, Interestingness and Credibility. I wrote more about this in my review of Abelson's book:
http://www.statisticalanalysisconsulting.com/book-review-statistics-as-principled-argument-by-robert-abelson/

-------------------------------------------
Peter Flom
-------------------------------------------
5. RE:Multiple comparison

Michael Chernick
Posted 07-11-2012 15:49
I can't agree with Peter on this one. I work a lot in the clinical trials arena. Hypothesis testing for safety and effectiveness is the main analysis of phase III trials. A type I error there means declaring a drug effective when it isn't. This type of error is very important to control from the FDA's point of view, so keeping the familywise error rate below 0.05 is important to the FDA. This is essential when doing multiple testing, testing multiple endpoints, or using adaptive clinical trial designs.
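
To see why the familywise error rate needs attention, here is a minimal Python sketch. It assumes the tests are independent and all null hypotheses are true, in which case the chance of at least one false positive among m tests at level alpha is 1 - (1 - alpha)^m. Real endpoints in a trial are usually correlated, so treat this as an illustration only; the function name is made up.

```python
def familywise_error_rate(alpha, m):
    """P(at least one false positive) for m independent tests, all nulls true."""
    return 1.0 - (1.0 - alpha) ** m


alpha = 0.05
for m in (1, 3, 5, 10, 20):
    unadjusted = familywise_error_rate(alpha, m)
    # Bonferroni: test each hypothesis at alpha / m instead of alpha.
    bonferroni = familywise_error_rate(alpha / m, m)
    print(f"m={m:2d}  unadjusted FWER={unadjusted:.3f}  Bonferroni FWER={bonferroni:.3f}")
```

With the Bonferroni per-test level, the familywise rate stays at or below alpha for every m in this independent setting.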

-------------------------------------------
Michael Chernick
Director of Biostatistical Services
Lankenau Institute for Medical Research
-------------------------------------------

6. RE:Multiple comparison

Peter Flom
Posted 07-11-2012 15:57
It's clearly partly dependent on field; I work mostly in the social and behavioral sciences, and I don't do RCT work (at least, not this kind).

Andrew Gelman, who is one of the best statisticians in the whole social/political/behavioral science field, agrees with me about significance.

OTOH, there is a good book (which includes stuff on clinical trials) called The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives by Ziliak and McCloskey.

-------------------------------------------
Peter Flom
-------------------------------------------

7. RE:Multiple comparison

Michael Chernick
Posted 07-11-2012 16:00
You should mention that Gelman is a Bayesian. So quite naturally he would dislike things of the frequentist variety.

-------------------------------------------
Michael Chernick
Director of Biostatistical Services
Lankenau Institute for Medical Research
-------------------------------------------

8. RE:Multiple comparison

Peter Flom
Posted 07-11-2012 16:08
True, but many frequentists have argued against the use of p-values (which is really my main point), and if you don't use p-values, the whole multiple comparison thing doesn't come up.

-------------------------------------------
Peter Flom
-------------------------------------------

9. RE:Multiple comparison

Cindy Weng
Posted 07-11-2012 16:23
Thank you for your comments! Very helpful!

Cindy Weng

10. RE:Multiple comparison

Daniel Jeske
Posted 07-11-2012 17:31
Peter,
Wouldn't the multiple comparisons issue still come up with simultaneous confidence intervals?
Dan

-------------------------------------------
Daniel Jeske
Professor and Chair
Department of Statistics
University of California - Riverside
-------------------------------------------

11. RE:Multiple comparison

Lance Heilbrun
Posted 07-11-2012 17:37
Yes, the MC problem is still there regarding multiple confidence intervals. There is at least one published paper about that:

"False Discovery Rate-Adjusted Multiple Confidence Intervals for Selected Parameters", by Y. Benjamini and D. Yekutieli.
J. American Statistical Assoc., 100(469):71-81, 2005.
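
For a simple picture of how multiplicity enters confidence intervals, here is a Python sketch of Bonferroni-adjusted simultaneous intervals. This is not the FDR-adjusted method of the paper cited above; it is the plainer familywise analogue, assuming approximately normal estimates with known standard errors. The estimates and standard errors are made up.

```python
from statistics import NormalDist


def bonferroni_intervals(estimates, std_errors, alpha=0.05):
    """Simultaneous normal-theory intervals with joint coverage >= 1 - alpha."""
    m = len(estimates)
    # Each interval is built at level 1 - alpha / m instead of 1 - alpha.
    z = NormalDist().inv_cdf(1.0 - alpha / (2.0 * m))
    return [(est - z * se, est + z * se) for est, se in zip(estimates, std_errors)]


estimates = [1.2, 0.4, -0.3, 2.1, 0.0]   # hypothetical point estimates
std_errors = [0.5, 0.5, 0.4, 0.6, 0.3]   # hypothetical standard errors
for (low, high), est in zip(bonferroni_intervals(estimates, std_errors), estimates):
    print(f"estimate={est:5.2f}  interval=({low:6.3f}, {high:6.3f})")
```

With five intervals the multiplier grows from about 1.96 to about 2.58, so each interval is wider than its unadjusted version.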

-------------------------------------------
Lance Heilbrun
Karmanos Cancer Institute
-------------------------------------------

12. RE:Multiple comparison

Margot Tollefson
Posted 07-12-2012 10:36
P-values have a uniform distribution under the assumptions of the given model for the null hypothesis. If you have a lot of tests that can be considered independent of each other, comparing the p-values to the quantiles of a uniform distribution on zero to one should tell you if you have unusually small p-values.
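
One way to sketch this comparison in Python is to measure the largest gap between the empirical distribution of the p-values and the Uniform(0, 1) distribution (the Kolmogorov-Smirnov distance). The simulation below is an assumption-laden toy: independent two-sided z-tests, 2000 tests, and a hypothetical 10% of tests with a true shift of 3 standard errors.

```python
import random
from statistics import NormalDist


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def ks_distance_from_uniform(p_values):
    """Largest gap between the empirical CDF of the p-values and Uniform(0, 1)."""
    ps = sorted(p_values)
    m = len(ps)
    # Check the gap just after and just before each jump of the empirical CDF.
    return max(max((i + 1) / m - p, p - i / m) for i, p in enumerate(ps))


rng = random.Random(0)  # fixed seed so the toy example is repeatable
null_p = [two_sided_p(rng.gauss(0, 1)) for _ in range(2000)]
# Replace 10% of the tests with ones where a real effect is present.
mixed_p = null_p[:1800] + [two_sided_p(rng.gauss(3, 1)) for _ in range(200)]

print("all nulls true: ", round(ks_distance_from_uniform(null_p), 4))
print("10% real effects:", round(ks_distance_from_uniform(mixed_p), 4))
```

A distance near zero is what uniform p-values would give; an excess of small p-values pushes the distance up.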

Margot

Margot Tollefson
Owner
Vanward Statistical Consulting
-------------------------------------------

13. RE:Multiple comparison

Chris Barker
Posted 07-12-2012 11:07
As mentioned - Tamhane and Hochberg have an excellent text on multiple comparisons.

Peter Westfall and Alex Dmitrienko have written a number of excellent papers on the topic as well.

http://multxpert.com/wiki/Alex_Dmitrienko

http://experts.ttu.edu/browse/profile/57

Multiple comparisons are also an issue in genomics, and in that area you want to read about the False Discovery Rate, starting with the original paper by Benjamini and Hochberg.

-------------------------------------------
Chris Barker, Ph.D.
www.barkerstats.com

---
"In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
-Steve Lacy
-------------------------------------------
14. RE:Multiple comparison

Michael Chernick
Posted 07-12-2012 11:14
Chris made some excellent recommendations. With microarrays the multiplicity of testing can be in the thousands and FWER is not useful. The false discovery rate was designed to handle such problems.
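
For readers who have not seen it, the Benjamini-Hochberg step-up procedure can be sketched in a few lines of Python. Sort the m p-values, find the largest rank k whose p-value is at most k * q / m, and reject the k smallest. The function name and the example p-values are made up; the guarantee on the false discovery rate holds for independent tests (and some forms of positive dependence).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Indices of hypotheses rejected by the BH step-up procedure at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, smallest first.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        # Step-up: keep the LARGEST rank that passes, even if earlier ranks fail.
        if p_values[idx] <= rank * q / m:
            largest_k = rank
    return sorted(order[:largest_k])


p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_rejected = benjamini_hochberg(p_values, q=0.05)
bonferroni_rejected = [i for i, p in enumerate(p_values) if p <= 0.05 / len(p_values)]
print("BH rejects:        ", bh_rejected)
print("Bonferroni rejects:", bonferroni_rejected)
```

On these made-up values BH rejects the two smallest p-values while Bonferroni rejects only the smallest, which is the kind of gain in power that matters when the number of tests is in the thousands.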

-------------------------------------------
Michael Chernick
Director of Biostatistical Services
Lankenau Institute for Medical Research
-------------------------------------------
