If the data in your groups have unequal variances but otherwise are reasonably normally distributed, nonparametric approaches may be inefficient, and a mixed-models approach can be more efficient. Once upon a time, I had data from an experiment in rats where each group's data had clearly unequal variances but otherwise were reasonably normally distributed. In SAS, I first added to my data set a variable called _dummy_, which consisted of a vector of all 1's, nothing else. Then I used the Repeated statement in SAS Proc Mixed with _dummy_ as my repeated-measures variable and with the "group=" option to model unequal group variances. Finally, I added the "ddfm=satterth" option to the Model statement to tell SAS to compute denominator degrees of freedom by the Satterthwaite method. Pairwise comparisons conducted in this manner matched exactly the results I obtained by doing pairwise Welch's t-tests, and this approach also gave me the ability to do something I could not have done otherwise: estimate an interaction term (i.e., a difference of the differences) in the presence of the unequal variances. The SAS code is copy-pasted below:
proc mixed data=bmdrat;
class subject_id groups _dummy_;
model change = groups /ddfm=satterth;
repeated _dummy_ /subject=subject_id group=groups;
lsmeans groups;
estimate 'CONO-CONY' groups 1 -1 0 0;
estimate 'HSO-HSY ' groups 0 0 1 -1;
estimate 'CONO-HSO ' groups 1 0 -1 0;
estimate 'CONY-HSY ' groups 0 1 0 -1;
estimate 'interaction' groups 1 -1 -1 1;
run;
Above, the treatment variable is called "groups", and the four groups are:
CONO: Control rats, old age,
CONY: Control rats, young age,
HSO: Treated rats, old age,
HSY: Treated rats, young age.
As you can see, "groups" is really a concatenation of two factors, treatment and age, which is why I was interested in estimating their interaction.
Subject_ID is the subject ID, of course, and _dummy_ is the dummy repeated-measures variable.
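For anyone who wants to see the arithmetic the SAS setup above is doing, here is a minimal Python sketch (with hypothetical, made-up data standing in for the rat experiment): each contrast of group means gets its own standard error built from the per-group variances, with Satterthwaite denominator degrees of freedom. A pairwise contrast such as [1 -1 0 0] then reproduces Welch's t-test exactly, and the interaction contrast [1 -1 -1 1] is handled the same way.

```python
import numpy as np
from scipy import stats

def welch_contrast(groups, c):
    """t-test for the contrast sum(c_i * mean_i), using per-group variances
    and Satterthwaite denominator degrees of freedom."""
    means = np.array([np.mean(g) for g in groups])
    vars_ = np.array([np.var(g, ddof=1) for g in groups])
    ns = np.array([len(g) for g in groups])
    c = np.asarray(c, dtype=float)
    est = c @ means
    pieces = c**2 * vars_ / ns            # each group's share of Var(contrast)
    se = np.sqrt(pieces.sum())
    df = pieces.sum()**2 / np.sum(pieces**2 / (ns - 1))   # Satterthwaite df
    t = est / se
    p = 2 * stats.t.sf(abs(t), df)
    return est, se, df, t, p

# hypothetical data: four groups (think CONO, CONY, HSO, HSY) with
# clearly unequal spreads
rng = np.random.default_rng(1)
g = [rng.normal(m, s, 12) for m, s in [(5, 1), (7, 3), (4, 0.5), (8, 2)]]

est, se, df, t, p = welch_contrast(g, [1, -1, 0, 0])   # CONO - CONY
welch_contrast(g, [1, -1, -1, 1])                      # interaction
```

For the pairwise contrast, the t statistic, Satterthwaite df, and p-value agree with `scipy.stats.ttest_ind(..., equal_var=False)`, which is the same correspondence I saw between Proc Mixed and the pairwise Welch tests.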
------------------------------
Eric Siegel, MS
Research Associate
Department of Biostatistics
Univ. Arkansas Medical Sciences
Original Message:
Sent: 02-25-2016 10:09
From: Ralph O'Brien
Subject: Variance heterogeneity in Anova
Because the subject line here is "Variance heterogeneity in ANOVA," I assume your research question has led you to compare J means. And it seems you want to protect all J*(J-1)/2 pairwise comparisons under a single familywise alpha level.
Let's consider comparing groups 1 and 3, that is, assessing (ybar1 - ybar3).
If you believed that all J variances are equal, then you could use the standard two-sample t statistic, but substitute the pooled estimate of the common variance (DFE = N-J) to estimate SE(ybar1 - ybar3), where SE is "standard error." This is just the contrast [1 0 -1 0 ...0]*mu, where mu is the vector of J means.
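As a sketch of that pooled-variance contrast test (hypothetical data, purely for illustration), one can pool the within-group sums of squares over DFE = N - J and apply the contrast vector directly:

```python
import numpy as np
from scipy import stats

def pooled_contrast(groups, c):
    """t-test for the contrast c'mu using the pooled variance (DFE = N - J)."""
    means = np.array([np.mean(g) for g in groups])
    ns = np.array([len(g) for g in groups])
    # pooled variance: total within-group sum of squares over N - J
    ss = sum(np.sum((np.asarray(g) - np.mean(g)) ** 2) for g in groups)
    dfe = ns.sum() - len(groups)
    s2 = ss / dfe
    c = np.asarray(c, dtype=float)
    se = np.sqrt(s2 * np.sum(c ** 2 / ns))
    t = (c @ means) / se
    return t, 2 * stats.t.sf(abs(t), dfe)

rng = np.random.default_rng(3)
groups = [rng.normal(0, 1, 10) for _ in range(4)]
t, p = pooled_contrast(groups, [1, 0, -1, 0])   # compare groups 1 and 3
```

With J = 2 this reduces to the standard two-sample pooled t-test; with J > 2, only the degrees of freedom change, which is exactly why unequal true variances can bias the pooled SE for a given pair.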
But when the variances are not equal, the pooled SE(ybar1 - ybar3) will be biased up or down depending on how the J sample sizes pair up with the J true variances. It's a long story, but we don't need to cover it.
Fortunately, you can assess (ybar1 - ybar3) quite well with the common Welch t test, which makes no HoV assumption and uses only the variances from groups 1 and 3 to properly estimate SE(ybar1 - ybar3). In R's t.test(), var.equal = FALSE is the default. In my teaching, I've done many Monte Carlo studies on this test, and it works surprisingly well even under vivid non-Normality (even with Y ~ Bernoulli) and with n's surprisingly small. (No, I will NOT give a number.) The Central Limit Theorem *usually* rescues us much sooner than most people believe. But that's another long story.
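A quick Monte Carlo sketch of that point (the sample size, p, and replication count below are arbitrary illustrative choices, not recommendations): apply Welch's t to two Bernoulli samples drawn from the same distribution and check how often the test rejects at the nominal 0.05 level.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2016)
reps, n, p_true = 4000, 30, 0.3   # arbitrary simulation settings
rejections = 0
for _ in range(reps):
    y1 = rng.binomial(1, p_true, n)   # two groups, same Bernoulli p,
    y2 = rng.binomial(1, p_true, n)   # so the null is true
    pval = stats.ttest_ind(y1, y2, equal_var=False).pvalue
    rejections += (pval < 0.05)
rate = rejections / reps   # sits near the nominal 0.05
```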
Switching to the Wilcoxon-Mann-Whitney test changes the research question! Instead you would be assessing theta = P[Y1 < Y3] + P[Y1 = Y3]/2. In spite of what you may read in many texts, the WMW test does NOT compare medians (or any other quantile) unless you are willing to assume the so-called "location shift" model, which is rarely tenable in practice, and in this case (unequal group variances), completely untenable. Whenever that condition supports using the WMW test to compare medians (and get a Hodges-Lehmann CI), it also supports using the WMW test to compare MEANS, but we never see anyone advocating using the WMW test to compare means. Another long story.
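To make that concrete, here is a small Python sketch (hypothetical samples, purely for illustration) of what the WMW statistic actually estimates: theta = P[Y1 < Y3] + P[Y1 = Y3]/2, computed once by brute force over all pairs and once from the U statistic.

```python
import numpy as np
from scipy import stats

y1 = np.array([1.2, 3.4, 2.2, 5.0, 4.1])   # hypothetical group 1
y3 = np.array([2.0, 6.5, 5.5, 7.1])        # hypothetical group 3

# direct (brute-force) estimate of theta over all n1*n3 pairs
theta_hat = np.mean([(a < b) + 0.5 * (a == b) for a in y1 for b in y3])

# the same number from the U statistic: SciPy's mannwhitneyu reports U for
# the first sample (pairs with y1 > y3), so theta_hat = 1 - U/(n1*n3)
u = stats.mannwhitneyu(y1, y3).statistic
theta_from_u = 1 - u / (len(y1) * len(y3))
```

Nothing in that estimand refers to a median; it is a probability of superiority, which is why the "WMW compares medians" shorthand only works under the location-shift model.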
Whether to assess all J*(J-1)/2 Welch t statistics using the Bonferroni correction is a separate issue. I taught that 35 years ago, but have long considered it a wishy-washy practice. Instead I create tight, custom statistical hypotheses to address the research hypotheses (hardly a novel idea), and take each one to be its own family. If a statistical hypothesis requires 2 contrasts to cover the research hypothesis, then I might well use alpha.family/2. But most of the time, I can address a research question tightly by forming a single contrast, so no Bonferroni adjustment is needed. This creates a specific relationship between each research question and each analysis, and it can also greatly increase the statistical power. For J=4, using 0.05/6 = 0.0083 for each test (or using 99.2% CIs) just inflates the Type II error rate (or widens the CIs) way too much in most cases. Of course, all this should be done when the study is planned, thus forcing the content researchers and the statisticians to collaborate when it will do the most good. Another long story.
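The power cost of that 0.05/6 adjustment can be sketched with the noncentral t distribution; the effect size and per-group n below are arbitrary illustrative choices, not recommendations.

```python
import numpy as np
from scipy import stats

def power_two_sided(alpha, df, ncp):
    """Power of a two-sided t-test with noncentrality parameter ncp."""
    tcrit = stats.t.ppf(1 - alpha / 2, df)
    return stats.nct.sf(tcrit, df, ncp) + stats.nct.cdf(-tcrit, df, ncp)

n, d = 20, 0.8                 # per-group n and standardized difference
ncp = d * np.sqrt(n / 2)       # noncentrality of a two-sample t
df = 2 * n - 2
power_single = power_two_sided(0.05, df, ncp)     # one targeted contrast
power_bonf = power_two_sided(0.05 / 6, df, ncp)   # Bonferroni over 6 pairs
```

For this (hypothetical) configuration, the single targeted contrast at alpha = 0.05 has noticeably higher power than the same contrast tested at 0.05/6, which is the Type II penalty the paragraph above is describing.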
Of course, the Bayesians were just chuckling as they read over all this (archaic) frequentist logic--if they even bothered to do so. And they are correct. An even longer story.
------------------------------
Ralph O'Brien
Professor of Biostatistics (officially retired; still keenly active)
Case Western Reserve University