Using information criteria (AIC, AICc, etc) for random and mixed effects models

    Posted 06-09-2014 12:02
    This message has been cross posted to the following eGroups: Statistical Consulting Section and Young Professionals Group.
    -------------------------------------------

    Hello all:

    I would like some help/feedback from statisticians who are aware that using information criteria becomes far less straightforward when one switches from a purely fixed effects model to a random/mixed effects model.

    To clarify, I am referring to Burnham and Anderson, "Model Selection and Multimodel Inference", 2nd edition, Section 6.6. I'll provide a simple example to illustrate the point. Consider a 1-way ANOVA with K treatments, which in the fixed effects form amounts to an intercept and (K - 1) main effect parameters:

    Y_ij = mu + tau_i + eps_ij

    Suppose we are to test this model against an intercept-only model. It is clear that the full and reduced models differ by (K - 1) parameters.
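
    To make the fixed effects bookkeeping concrete, here is a minimal sketch of the AIC comparison. The data are made up purely for illustration, the helper name gaussian_aic is hypothetical, and the error variance is counted as one parameter in both models (so it cancels in the difference).

```python
import math

def gaussian_aic(rss, n_obs, n_mean_params):
    """AIC of a Gaussian linear model evaluated at its ML estimates."""
    sigma2_hat = rss / n_obs  # ML (not unbiased) estimate of the error variance
    loglik = -0.5 * n_obs * (math.log(2 * math.pi * sigma2_hat) + 1)
    n_params = n_mean_params + 1  # mean parameters plus the error variance
    return -2 * loglik + 2 * n_params, n_params

# Made-up data (assumption): K = 3 treatments, 4 observations each.
groups = [
    [4.1, 5.0, 4.6, 5.3],
    [6.2, 5.8, 6.9, 6.4],
    [5.1, 4.7, 5.5, 5.0],
]
K = len(groups)
all_obs = [y for g in groups for y in g]
n_obs = len(all_obs)

# Reduced model: one common mean. Full model: one mean per treatment.
grand_mean = sum(all_obs) / n_obs
rss_reduced = sum((y - grand_mean) ** 2 for y in all_obs)
rss_full = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

aic_full, p_full = gaussian_aic(rss_full, n_obs, K)
aic_reduced, p_reduced = gaussian_aic(rss_reduced, n_obs, 1)

print("parameter difference:", p_full - p_reduced)
print("AIC full:", round(aic_full, 2), "AIC reduced:", round(aic_reduced, 2))
```

    Here the penalty difference is exactly 2(K - 1), because every parameter is explicit and can simply be counted.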

    Now, if we treat the treatment factor as random, then the tau_i are assumed to be iid N(0, sigma_mu^2). The intercept-only model now corresponds to the restriction sigma_mu = 0, so the full and reduced models differ by just 1 explicit parameter.

    In reality, the effective number of parameters in the random effects model is greater than it appears to be. The random tau_i are assumed to come from a Normal distribution, which amounts to some extra implicit parameters on top of the one explicit parameter sigma_mu. Therefore, the difference in the effective number of parameters between the full random model and the intercept-only model lies somewhere between 1 and K - 1. The effective number of parameters is not easy to compute in practice, which has a few important and somewhat funny implications:

    1) It is perfectly possible to compute the likelihood function, but it is useless if the goal is to perform the above-mentioned test using AIC and similar information criteria, because the penalty term requires the unknown effective number of parameters.

    2) Likewise, the likelihood function is useless for the LR test, because the number of degrees of freedom of the LR chi-squared statistic is unknown.

    3) If we are to compare two mixed models that differ only with respect to fixed effects, then the LR test and information criteria can be applied "as usual" (for LR, we should consider only nested models).
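
    To illustrate what "somewhere in between" can look like, here is a rough sketch for the balanced one-way case. It rests on several assumptions that are mine and not B&A's: the effective number of mean parameters is defined as the trace of the linear map from the data to the shrunken (BLUP) fitted values, the variance components are treated as known, and each of the K treatments has n observations. The function names are hypothetical. Under these assumptions the trace has the closed form 1 + (K - 1) * lam, where lam = n * sigma2_tau / (n * sigma2_tau + sigma2_eps) is the shrinkage factor.

```python
def shrunken_fit(y, K, n, lam):
    """Fitted values of a balanced one-way random effects model.

    y is a flat list of K * n observations in group-major order;
    each group mean is shrunk toward the grand mean by the factor lam.
    """
    grand_mean = sum(y) / (K * n)
    fitted = []
    for i in range(K):
        group_mean = sum(y[i * n:(i + 1) * n]) / n
        fitted.extend([grand_mean + lam * (group_mean - grand_mean)] * n)
    return fitted

def effective_mean_params(K, n, sigma2_tau, sigma2_eps):
    """Trace of the hat matrix, assuming known variance components."""
    lam = n * sigma2_tau / (n * sigma2_tau + sigma2_eps)
    N = K * n
    trace = 0.0
    for j in range(N):
        # The fit is linear in y, so fitting the j-th unit vector gives
        # column j of the hat matrix; its j-th entry is the diagonal element.
        unit = [0.0] * N
        unit[j] = 1.0
        trace += shrunken_fit(unit, K, n, lam)[j]
    return trace

K, n = 6, 5
for sigma2_tau in (0.0, 0.5, 2.0, 1000.0):
    print(sigma2_tau, round(effective_mean_params(K, n, sigma2_tau, 1.0), 3))
```

    With sigma2_tau = 0 the count collapses to the single intercept, and as sigma2_tau grows it approaches the K means of the fixed effects model. In practice the variance components must be estimated, which this sketch ignores, so it shows the idea rather than solving the problem.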

    The presence of implicit parameters in the random/mixed models does not appear to be widely recognized. In particular, I was looking up the difference between MLE and REML and came across this:

    http://users.stat.umn.edu/~corbett/classes/5303/REML.pdf

    In Section 8, the author suggests it's fine to use AIC/BIC in the usual manner for both random and mixed models as long as full ML (as opposed to REML) estimation is performed.

    B&A provide a rather narrow example in which the effective number of parameters can be computed for a mixed model (p 315), so that AIC/AICc can be applied for model selection. However, they emphasize that a general solution is elusive, although some options (DIC, h-likelihood) look promising.

    If you have come across a similar problem and found a theoretically sound solution that has been implemented in some software, please share your experience.

    Regards,

    Nik