Multiple comparison

  • 1.  Multiple comparison

    Posted 07-11-2012 15:10

    To whom it may concern:

    When should I be concerned about multiple comparisons, and what are the solutions?
    Are there any good references or books?

    Thank you !
    Cindy Weng
    -------------------------------------------
    Cindy Weng
    Statistician
    -------------------------------------------


  • 2.  RE:Multiple comparison

    Posted 07-11-2012 15:19
    You should do it whenever you are running several simultaneous hypothesis tests.
    You can apply bounds like Bonferroni, or adjust p-values to control the familywise error rate via resampling methods or other approaches.
    There are several good books on multiple testing. Jason Hsu's book is excellent. There is also Rupert Miller's "Simultaneous Statistical Inference" (a little outdated, though) and the resampling approach in the text by Westfall and Young.
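
    As a rough illustration of the adjustments mentioned above, here is a minimal Python sketch of Bonferroni-adjusted p-values alongside Holm's step-down version. The function names and the p-values are made up for illustration, and Holm is not named in the post; it is included only as a common refinement that also controls the familywise error rate.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by m - k + 1,
        # and adjusted values are forced to be non-decreasing.
        running_max = max(running_max, min(1.0, (m - rank) * pvals[i]))
        adjusted[i] = running_max
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
print("Bonferroni:", bonferroni(pvals))
print("Holm:      ", holm(pvals))
```

    A hypothesis is rejected at familywise level 0.05 when its adjusted p-value is at most 0.05. Holm's adjusted values are never larger than Bonferroni's, so it rejects at least as often.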

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 3.  RE:Multiple comparison

    Posted 07-11-2012 15:25
    Tamhane and Hochberg also have a text on Multiple Comparison Procedures.

    http://www.amazon.com/Multiple-Comparison-Procedures-Probability-Statistics/dp/047056833X/ref=sr_1_3?s=books&ie=UTF8&qid=1342034548&sr=1-3

    -------------------------------------------
    Patrick Spagon
    -------------------------------------------








  • 4.  RE:Multiple comparison

    Posted 07-11-2012 15:33
    Oh boy, you've opened up a can of worms with this one!

    A short summary is the line from Jacob Cohen in his book on multiple regression "This is a subject on which reasonable people can differ".

    My own view is that people concern themselves with this far too much; but this is a symptom of a general vast over-emphasis of p-values and significance testing. My favorite professor in grad school used to say "Stop p-ing on the research!". Remember what a p-value is: It is a measure of how often, if you do totally ridiculous things, you will get significant results. More technically, it is the proportion of times you will get a test statistic as large or larger than the one you got in the sample you had, if the real effect in the population from which the sample was drawn was 0.
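
    To make that "proportion of times" definition concrete, here is a small simulation sketch in Python. It assumes a test statistic that is standard normal when the true effect is 0, and the observed value of 2.0 is hypothetical; neither comes from the discussion.

```python
import random
from statistics import NormalDist

rng = random.Random(1)      # fixed seed so the run is repeatable
observed_z = 2.0            # hypothetical observed test statistic
n_sims = 100_000

# Draw test statistics from a world where the true effect is exactly 0.
null_stats = [rng.gauss(0.0, 1.0) for _ in range(n_sims)]

# One-sided p-value: share of null statistics at least as large as the observed one.
simulated_p = sum(z >= observed_z for z in null_stats) / n_sims
exact_p = 1 - NormalDist().cdf(observed_z)

print(f"simulated p = {simulated_p:.4f}, exact p = {exact_p:.4f}")
```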

    Unless you are on a total fishing expedition, this is a largely irrelevant question. If you are on a total fishing expedition, you have other problems to cope with! :-)

    Also remember that, when you lower type 1 error (e.g. by correcting for multiple comparisons), you necessarily raise type 2 error, at least for a fixed sample size and effect size. Which is worse? The default criteria are type 1 = 0.05 (or 0.01 sometimes) and type 2 = 0.2 (or 0.1 sometimes) (that is, power = 0.8 or 0.9). Is a type 1 error 4 times worse than a type 2 error? It depends.
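
    The trade-off can be sketched numerically. The Python snippet below assumes a two-sided z-test whose effect size and sample size give exactly 80% power at alpha = 0.05, then shows what happens to power when alpha is split Bonferroni-style across m tests. The helper name `power_two_sided` and the choice of m values are illustrative assumptions.

```python
from statistics import NormalDist

Z = NormalDist()


def power_two_sided(alpha, ncp):
    """Power of a two-sided z-test with noncentrality ncp = effect / standard error."""
    z_crit = Z.inv_cdf(1 - alpha / 2)
    return (1 - Z.cdf(z_crit - ncp)) + Z.cdf(-z_crit - ncp)


# Noncentrality chosen so that power is about 0.80 at alpha = 0.05.
ncp = Z.inv_cdf(0.975) + Z.inv_cdf(0.80)

for m in (1, 5, 20, 100):
    alpha_per_test = 0.05 / m  # Bonferroni split across m tests
    print(f"m = {m:3d}  per-test alpha = {alpha_per_test:.4f}  "
          f"power = {power_two_sided(alpha_per_test, ncp):.3f}")
```

    With the sample size held fixed, power can only fall as the per-test alpha shrinks; recovering it requires more data or a larger effect.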

    Then, if you DO decide to correct for multiple comparisons, you have the very vexed question of how many comparisons you should correct for. All the analyses in one table? One article? One area of research? Or what? For example, if 100 people look at the relationship between social security number and weight, then you should correct for ALL 100, even the ones you don't know about! 

    In short, I think corrections for multiple comparisons are rarely needed.

    Rather than evaluate research this way, I like Robert Abelson's MAGIC criteria: Magnitude, Articulation, Generality, Interestingness and Credibility. I wrote more about this in my review of Abelson's book:
    http://www.statisticalanalysisconsulting.com/book-review-statistics-as-principled-argument-by-robert-abelson/

     

    -------------------------------------------
    Peter Flom
    -------------------------------------------








  • 5.  RE:Multiple comparison

    Posted 07-11-2012 15:49
    I can't agree with Peter on this one. I work a lot in the clinical trials arena. Hypothesis testing for safety and effectiveness is the main analysis of phase III trials. A type I error declares a drug effective when it isn't. This type of error is very important to control from the FDA's point of view, so keeping the familywise error rate below 0.05 is important to the FDA. This is essential when doing multiple testing, testing multiple endpoints, or using adaptive clinical trial designs.
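
    To see why the familywise error rate needs attention, consider m independent tests, each run at level alpha with every null hypothesis true. The chance of at least one false rejection is 1 - (1 - alpha)^m. The sketch below is only an illustration under that independence assumption (real endpoints are usually correlated), and the function name is made up.

```python
def fwer_independent(alpha, m):
    """Probability of at least one type I error across m independent true-null tests."""
    return 1 - (1 - alpha) ** m


for m in (1, 5, 10, 20):
    unadjusted = fwer_independent(0.05, m)       # every test run at 0.05
    bonferroni = fwer_independent(0.05 / m, m)   # every test run at 0.05 / m
    print(f"m = {m:2d}  unadjusted FWER = {unadjusted:.3f}  "
          f"Bonferroni FWER = {bonferroni:.3f}")
```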

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 6.  RE:Multiple comparison

    Posted 07-11-2012 15:57
    It's clearly partly dependent on field; I work mostly in the social and behavioral sciences, and I don't do RCT work (at least, not this kind).

    Andrew Gelman, who is one of the best statisticians in the whole social/political/behavioral science field, agrees with me about significance.

    OTOH, there is a good book (which includes stuff on clinical trials) called The Cult of Statistical Significance: How the Standard Error Costs Us Jobs, Justice, and Lives by Ziliak and McCloskey.

    -------------------------------------------
    Peter Flom
    -------------------------------------------








  • 7.  RE:Multiple comparison

    Posted 07-11-2012 16:00
    You should mention that Gelman is a Bayesian. So quite naturally he would dislike things of the frequentist variety.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------








  • 8.  RE:Multiple comparison

    Posted 07-11-2012 16:08
    True, but many frequentists have argued against the use of p-values (which is really my main point), and if you don't use p-values, the whole multiple comparison thing doesn't come up.

    -------------------------------------------
    Peter Flom
    -------------------------------------------








  • 9.  RE:Multiple comparison

    Posted 07-11-2012 16:23
    Thank you for your comments ! Very helpful !

    Cindy Weng




  • 10.  RE:Multiple comparison

    Posted 07-11-2012 17:31
    Peter,
    Wouldn't the multiple comparisons issue still come up with simultaneous confidence intervals?
    Dan

    -------------------------------------------
    Daniel Jeske
    Professor and Chair
    Department of Statistics
    University of California - Riverside
    -------------------------------------------








  • 11.  RE:Multiple comparison

    Posted 07-11-2012 17:37
    Yes, the MC problem is still there with multiple confidence intervals. There is at least one published paper about that:

    "False Discovery Rate-Adjusted Multiple Confidence Intervals for Selected Parameters", by Y. Benjamini and D. Yekutieli.

    J. American Statistical Assoc., 100(469):71-81, 2005.
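
    For a simple picture of how multiplicity shows up in intervals, the sketch below widens normal-theory intervals with a Bonferroni split so that all of them cover jointly with probability at least 95%. This is not the false-coverage-rate method of the paper cited above; it is the most basic simultaneous adjustment, and the estimates, standard errors, and function name are hypothetical.

```python
from statistics import NormalDist


def z_multiplier(alpha, m=1):
    """Two-sided normal critical value; m > 1 splits alpha across m intervals (Bonferroni)."""
    return NormalDist().inv_cdf(1 - alpha / (2 * m))


# Hypothetical estimates and standard errors for three parameters.
estimates = [1.2, -0.4, 0.9]
std_errors = [0.5, 0.3, 0.4]
m = len(estimates)

z_single = z_multiplier(0.05)     # about 1.96: 95% coverage for one interval alone
z_joint = z_multiplier(0.05, m)   # larger: at least 95% coverage for all m jointly

for est, se in zip(estimates, std_errors):
    print(f"{est:+.1f}: individual ({est - z_single * se:.2f}, {est + z_single * se:.2f}), "
          f"simultaneous ({est - z_joint * se:.2f}, {est + z_joint * se:.2f})")
```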


    -------------------------------------------
    Lance Heilbrun
    Karmanos Cancer Institute
    -------------------------------------------








  • 12.  RE:Multiple comparison

    Posted 07-12-2012 10:36
    P-values have a uniform distribution under the assumptions of the given model for the null hypothesis. If you have a lot of tests that can be considered independent of each other, comparing the p-values to the quantiles of a uniform distribution on zero to one should tell you if you have unusually small p-values.
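
    One way to sketch this comparison in Python: sort the p-values, line them up against uniform quantiles i / (m + 1), and look at the largest gap. The simulated data, the 10% share of real effects, and the helper names are all assumptions made for illustration; in practice a QQ plot or a formal test such as Kolmogorov-Smirnov would serve the same purpose.

```python
import random
from statistics import NormalDist

Z = NormalDist()


def two_sided_p(z):
    """Two-sided p-value for a z statistic."""
    return 2 * (1 - Z.cdf(abs(z)))


def max_gap_from_uniform(pvals):
    """Largest gap between the sorted p-values and the uniform quantiles i / (m + 1)."""
    m = len(pvals)
    return max(abs(p - (i + 1) / (m + 1)) for i, p in enumerate(sorted(pvals)))


rng = random.Random(0)
m = 2000

# All nulls true: the p-values should track the uniform quantiles closely.
null_p = [two_sided_p(rng.gauss(0, 1)) for _ in range(m)]

# 10% of tests have a real effect (mean shift of 3): too many small p-values.
mixed_p = [two_sided_p(rng.gauss(3 if i < m // 10 else 0, 1)) for i in range(m)]

print("max gap, all nulls true:", round(max_gap_from_uniform(null_p), 3))
print("max gap, 10% real effects:", round(max_gap_from_uniform(mixed_p), 3))
```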

    Margot


    Margot Tollefson
    Owner
    Vanward Statistical Consulting
    -------------------------------------------








  • 13.  RE:Multiple comparison

    Posted 07-12-2012 11:07
    As mentioned, Tamhane and Hochberg have an excellent text on multiple comparisons.

    Peter Westfall and Alex Dmitrienko have written a number of excellent papers on the topic as well.

    http://multxpert.com/wiki/Alex_Dmitrienko

    http://experts.ttu.edu/browse/profile/57

    Multiple comparisons are also an issue in genomics, and in that area you will want to read about the False Discovery Rate in the original paper by Benjamini and Hochberg.
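
    For readers who have not seen it, the Benjamini-Hochberg step-up procedure is short enough to sketch in Python. The function name and the ten p-values below are invented for illustration, and the procedure as written assumes independent (or positively dependent) tests.

```python
def benjamini_hochberg(pvals, q=0.05):
    """Return a list of booleans: True where the BH step-up procedure rejects."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k such that the k-th smallest p-value is <= (k / m) * q.
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cutoff.
    reject = [False] * m
    for i in order[:cutoff_rank]:
        reject[i] = True
    return reject


# Hypothetical p-values from ten tests.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

bh_rejections = benjamini_hochberg(pvals, q=0.05)
bonferroni_rejections = [p <= 0.05 / len(pvals) for p in pvals]

print("BH rejects:        ", sum(bh_rejections))
print("Bonferroni rejects:", sum(bonferroni_rejections))
```

    Because the BH threshold grows with rank, it is less strict than Bonferroni for all but the smallest p-value, which is why it stays usable when the number of tests runs into the thousands.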


    -------------------------------------------
    Chris Barker, Ph.D.
    www.barkerstats.com

    ---
    "In composition you have all the time you want to decide what to say in 15 seconds, in improvisation you have 15 seconds."
    -Steve Lacy
    -------------------------------------------








  • 14.  RE:Multiple comparison

    Posted 07-12-2012 11:14
    Chris made some excellent recommendations.  With microarrays the multiplicity of testing can be in the thousands and FWER is not useful.  The false discovery rate was designed to handle such problems.

    -------------------------------------------
    Michael Chernick
    Director of Biostatistical Services
    Lankenau Institute for Medical Research
    -------------------------------------------