ASA Connect


mixed model

  • 1.  mixed model

    Posted 03-25-2019 09:45
    Dear all, can someone help me with a simple clarification on what random effect and fixed effect are in a mixed model. I have been appraoched many times for explanation on that. Definitions i got online couldnt help at all. A simple explanation with example will help a great deal.
    Best,



    ------------------------------
    Ikenna Nnabue
    Research Officer
    National Root Crops Research Institute.
    ------------------------------


  • 2.  RE: mixed model

    Posted 03-26-2019 04:39
    Dear Ikenna, several simple answers did not satisfy A Gelman (doi:10.1214/009053604000001048 p.20ff) and I Kreft & J de Leeuw (http://gifi.stat.ucla.edu/janspubs/1998/books/kreft_deleeuw_B_98.pdf p.10f). So I conjecture, such criteria may apply conditional on context: discussion of models and estimation techniques or of applications. In a wine tasting we may test the wines or the tasters.

    ------------------------------
    Reinhard Vonthein
    Universitaet Zu Luebeck
    ------------------------------



  • 3.  RE: mixed model

    Posted 03-26-2019 08:26

    As my initial rule of thumb when I'm modeling data that have fixed and random effects, I ask myself, "If I or someone else were repeating this experiment (or doing it at the same time), what effects would be the same for all of us and what would be different?" So a treatment (fertilizer, seed, etc.) that we are testing would be the same across experiments and thus would be fixed. Things that cause variation but differ across experiments (plots in fields, benches in greenhouses, etc.) are typically considered random. They are a "sample" from the population of plots that could have been used.
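
    In model notation, a minimal sketch of that rule of thumb using R's lme4 (the data frame and column names here are hypothetical):

    library(lme4)
    # treatment is fixed: its levels would be the same in any repeat of the experiment
    # plot is random: a "sample" from the population of plots that could have been used
    fit <- lmer(yield ~ treatment + (1 | plot), data = field_trial)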

    I have two books I recommend when it comes to mixed model references: "Analysis of Messy Data" by Milliken & Johnson, and "SAS for Mixed Models," which, even if you do not use SAS, does a good job of explaining the theory/ideas of mixed models, which is necessary regardless of what software package you use.



    ------------------------------
    Elizabeth Claassen
    ------------------------------



  • 4.  RE: mixed model

    Posted 03-27-2019 11:54
    You may refer to "Variance Components Estimation" by Poduri S.R.S. Rao (1997), Chapman & Hall. It covers the methodologies and applications of fixed and random effects models, with illustrations and exercises.

    ------------------------------
    Poduri S.R.S. Rao
    Professor of Statistics
    University of Rochester
    Rochester, NY 14627
    ------------------------------



  • 5.  RE: mixed model

    Posted 03-26-2019 09:04
    Hope this helps.

    In 1960, Green and Tukey wrote "When a sample exhausts the population, the corresponding variable is fixed; when the sample is a small (i.e., negligible) part of the population the corresponding variable is random."

    If we have an experiment with a random sample of washing machines of 3 different brands and the outcome is reduction in dirt, we are interested in the effect of the brand of the washing machine, not the individual washing machine.
    The brand of washing machine is a fixed effect and the effect of the individual washing machine is a random effect.

    Consider clinics. If the experiment were repeated, would you choose the same clinics, or is each clinic site just a random sample from many clinics? If the same clinic sites would be chosen again, then clinic site is fixed. On the other hand, if we select a random sample of clinics from a huge list, then clinic site would be a random effect.
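
    In model notation (a minimal sketch using R's lme4; the data frame and column names are hypothetical):

    library(lme4)
    # brand is a fixed effect; the individual washing machine is a random effect
    fit <- lmer(dirt_reduction ~ brand + (1 | machine), data = wash_data)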

    ------------------------------
    Brandy Sinco, BS, MA, MS
    Statistician and Programmer/Analyst
    ------------------------------



  • 6.  RE: mixed model

    Posted 03-27-2019 08:47
    In the classic old ANOVA context, a fixed factor was one that fully represents the range of variation of that factor. This could be either because it literally includes all the categories of interest (like male, female, and transgender // or drug A, drug B, and placebo) or it adequately represents the range in a numeric case (like choosing subjects who were 3-4 years old, 6-7 years old and 9-10 years old, leaving a few gaps to increase differences but still covering the age range).

    Random factors are ones that can randomly vary from study to study, and therefore part of the analysis is trying to figure out how much of the observed effects may be due to that very random process. So, comparing surgical to medical sections on patient satisfaction in 3 different hospitals. If the real comparison is between medical and surgical, with these 3 hospitals just a random choice, you could have picked 3 with a very good surgical department. Or 3 with a very poor surgical department. Estimating the effect of department becomes much messier when the other factor (hospital) is random.

    If you only want to compare three hospitals, just to show that "hospitals differ," it actually doesn't matter if hospital is fixed or random. The math, even in standard ANOVA, works out to the same p value.

    It's the OTHER factor that is heavily affected.

    Suppose I have 10 hospitals in an area. That's all of them. I take a sample, say 100 patients, from each in the medical and surgical wings to assess satisfaction. Because hospital is fixed here (I used all of them in the region I care about), the analysis in effect gives me 1,000 surgical patients and 1,000 medical patients.

    But what if I had randomly selected 3 hospitals? Classic ANOVA would more or less give you an N of 3! Why? Because you can't be sure the three you picked represent the full range -- you may have three that share some characteristic (like an especially good surgical wing). Therefore the analysis says, "You have good evidence that surgery is better... in these 3 hospitals... but the p value to generalize to all 10 is based on a paired t-test in which the mean of each department within each hospital is the unit of analysis." N = 3.
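
    A sketch of that last point in base R (the data frame and column names are hypothetical): collapse the data to the department-by-hospital means, and the test for generalizing is effectively a paired t-test on 3 pairs, no matter how many patients were sampled per hospital.

    # mean satisfaction for each department within each hospital
    agg  <- aggregate(satisfaction ~ dept + hospital, data = patients, FUN = mean)
    med  <- agg$satisfaction[agg$dept == "medical"]
    surg <- agg$satisfaction[agg$dept == "surgical"]
    t.test(med, surg, paired = TRUE)  # 3 pairs, so only 2 degrees of freedom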

    Modern mixed model analyses (maximum likelihood and such) are more efficient at this, but I don't think they work all that much better when the random factor has very few levels.

    ------------------------------
    Edward Gracely
    Drexel University
    ------------------------------



  • 7.  RE: mixed model

    Posted 03-28-2019 15:14
    I like this paper, especially for researchers in biology and ecology.

    Bennington, C.C. and W.V. Thayne. 1994. Use and Misuse of Mixed Model Analysis of Variance in Ecological Studies. Ecology 75: 717-722.


    ------------------------------
    Susan Durham
    Utah State University
    ------------------------------



  • 8.  RE: mixed model

    Posted 04-04-2019 14:41

    Hi Ikenna,

    I teach a workshop on Mixed Models that I've given about 15 times to close to a thousand researchers, and I agree that this is one of those topics that is most difficult for learners of mixed models to wrap their heads around. It's just not intuitive.

    After years of explaining it, I think I've finally managed an explanation that researchers get. What I have found is that people get confused about the difference between a control variable and a random factor. The key insight people are missing is the idea of a design or blocking variable and how that differs from a covariate (where a covariate can be categorical or continuous, a.k.a. a control variable). Covariates can only be fixed, because we're controlling for the *specific* values of that variable. Only blocking variables can be either fixed or random, because sometimes we care about controlling for the specific values of that variable and sometimes the values are interchangeable.

    If you want to see the full explanation, I put together a free webinar on it: https://thecraftofstatisticalanalysis.com/fixed-random-factors/. It's about 45 minutes and yes, it takes that long to really explain it. :)

    I've also found that one of the things we statisticians do inadvertently that really confuses people is we use the terms "random effect" and "random factor" as if they're the same thing. (And yes, I have caught myself doing it).

    For example, you'll see statements like "hospital was a random effect in the model because patients were nested within hospitals." This makes it hard for people to really understand what a random effect (the random intercept and/or the slope for that hospital) is and what the random factor (the hospital) is. Here's a blog post I wrote about that: https://www.theanalysisfactor.com/random-factors-random-effects-difference/



    ------------------------------
    Karen Grace-Martin
    Senior Statistical Consultant
    The Analysis Factor, LLC
    ------------------------------



  • 9.  RE: mixed model

    Posted 04-08-2019 21:30
    It would help to avoid misunderstanding if covariates were not referred to as "control variables." By their nature, they are only observed; they are not under control in the usual sense that the factors in a design are. Thus, it is more accurate to say that the analysis "adjusts for" the contributions of covariates.

    And, since a covariate (especially a continuous covariate) has whatever values happen to be observed, the meaning of "controlling for the 'specific' values of that variable" is not clear to me.

    ------------------------------
    David Hoaglin
    ------------------------------



  • 10.  RE: mixed model

    Posted 04-09-2019 09:23
    Sigh. I have worked with data analysts in many, many scientific fields for 20 years. I have found vocabulary, especially for predictor variables, to be one of the biggest challenges.

    Every field, and even specific voices within a field, has different definitions for specific terms that fit the kinds of research they do. Or, even worse, different terms for the same definition.

    This alone causes so much confusion among people who are trying to learn to do good statistics, because they read books or take classes or workshops by someone who is using slightly different definitions without mentioning it, and they get really confused. Especially because so many of these definitions have very slight differences, so it's not immediately clear that they differ.

    I recently did a workshop in which I described some of the different terms and definitions used for predictor variables (and by predictor variables here I mean any X, regardless of its role or level of measurement). One of the terms was "Independent Variable." I immediately had two different participants say, "Wait - doesn't the term Independent Variable imply" [a certain definition]? And they gave completely opposite definitions.

    So since there is no consensus on the specific definitions of any of these terms, I always suggest people do what I did in my response, and what you did with "control variables": define the terms as they're using them. Beyond that, I don't worry too much, as long as people are getting the concepts. I've never heard the concern that controlling for a variable implies you've controlled the values of that variable, but sure, "adjusts for" works.

    Anyway, I don't know if you watched the full webinar, but I suspect my point about "specific values" will make more sense there. It's really only about the categorical variables (which I call factors). Sometimes the summary just isn't clear, as it seems was true here.

    ------------------------------
    Karen Grace-Martin
    Senior Statistical Consultant
    The Analysis Factor, LLC
    ------------------------------



  • 11.  RE: mixed model

    Posted 04-10-2019 13:20
    Thank you to Karen & David for calling much needed attention to confusing statistical terminology. Fortunately, I have been seeing signs of a growing awareness that we as a field need to deal with this problem.

    In my classes, I make a point of tackling terminology issues head on.  For example, "independent" and "independence" are terms that are used in a variety of contexts and cause a great deal of confusion.  I would like to purge "independent"  from the terminology for defining the role a variable plays in an analysis! Clearly, in regression analysis (mixed or not), it is common for X variables to be correlated (and thus not independent at all).  Calling the X variables "explanatory variables" or "predictor variables" helps students see that the researcher is defining a role for the variables, rather than the variables inherently having a characteristic. 

    I just became aware of another problematic use of the term "independence". Textbooks commonly call a two-variable chi-square test a "chi-square test of independence."  And then when discussing the assumptions/conditions to evaluate when using the test, "independent observations" is listed.  Understanding the difference between  "independent observations" and "values on two variables being independent of each other" is very difficult for many students!

    I would like to encourage educators to call a two-variable chi-square test "a chi-square test of association." We are interested in whether or not two variables are associated and the null hypothesis is "no association."  

    Thanks!

     


    ------------------------------
    Sheila Barron
    Statistician Manager
    University of Iowa
    ------------------------------



  • 12.  RE: mixed model

    Posted 04-15-2019 10:56
    Great answer, Sheila! One thing I have pondered for myself for a while now is this:

    In practice, there are studies that use models aiming at both causal explanation and prediction. For those studies, it would seem not to matter whether we call the X variables "explanatory variables" or "predictor variables," since those studies have a dual purpose. For studies focusing solely on causal explanation or solely on prediction, the terminology for the X variables would be straightforward, though one would still have to be mindful of the purpose of the study in order to use the correct terminology.

    Galit Shmueli classified the purposes of a statistical model as causal explanation, empirical prediction, and description (https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf). What would the correct terminology for X variables in a descriptive study be? Descriptive variables?



    ------------------------------
    Isabella Ghement
    Ghement Statistical Consulting Company Ltd.
    ------------------------------



  • 13.  RE: mixed model

    Posted 04-09-2019 17:03
    David Hoaglin is too polite in merely calling the practice of referring to covariates as "control variables" "not clear." In my mind, it is inappropriate to call them controlling variables, whether discrete or continuous. As the name clearly implies, covariates are "covariables" and should be designated as such, whether in a fixed, random, or mixed model. In any model, covariates are used to adjust for, not to "control" for, extraneous variation affecting the variates in question. A classic example is adjustment of the post-treatment body weight of subjects using the pre-treatment body weight as a covariate.


    ------------------------------
    Ajit K. Thakur, Ph.D.
    Retired Statistician
    ------------------------------



  • 14.  RE: mixed model

    Posted 04-10-2019 05:33
    Interestingly, Ajit's classic example - "adjustment of post-treatment body weight of subjects using the pre-treatment body weight as a covariate" - is routinely referred to as "baseline-controlled" in the lab by non-statisticians. I can definitely understand how this leads to confusion.

    ------------------------------
    Jonathan Stallings
    CEO
    Data InDeed
    ------------------------------



  • 15.  RE: mixed model

    Posted 04-10-2019 13:26
    This discussion leads me to make the following "Swiftian" modest proposal:

    1. The ASA should have a contest to find those terms frequently used in the probability and statistics literature that are the most confusing to non-statisticians. Some suitable award should be given to the winners.

    My personal favorite, speaking as a statistical tourist, is Random Variable. The term bothers me because the textbook definition says that a random variable is a function. I always wanted to shout, "Wait - isn't a variable an argument of a function, and not itself a function?" And then there is that little modifier "random" that also made me want to shout, "Wait - all the examples of random variables in the textbooks were in fact functions that, like all good functions, assigned a specific, unique value in the codomain to every argument in the domain. So where is the randomness in that?" And when no less than a founder of the field, Kolmogorov, despaired of finding a suitable definition of randomness, and now McElreath in his lectures suggests kicking the term "off the island" altogether, well, I feel like less of an idiot. But only slightly less.

    2. On a more serious note maybe a compilation of confusing terms with alternative interpretations might actually be helpful to non-statisticians.

    ------------------------------
    Michael Sack Elmaleh
    Principal
    Michael Sack Elmaleh CPA, CVA
    ------------------------------



  • 16.  RE: mixed model

    Posted 04-14-2019 00:56
    Mike Elmaleh wrote:
    "On a more serious note maybe a compilation of confusing terms with alternative interpretations might actually be helpful to non-statisticians."

    I agree. You can find some examples linked from https://www.ma.utexas.edu/users/mks/statmistakes/TOC.html



    ------------------------------
    Martha Smith
    University of Texas
    ------------------------------



  • 17.  RE: mixed model

    Posted 04-14-2019 01:05
    Clarity of terms is essential for communication. The link mentioned by Martha did not open for me. However, I can add two references:
    https://www.nature.com/articles/nmeth.3489?proof=true
    https://www.wiley.com/go/kenett-redman/datascience

    ron

    ------------------------------
    Ron Kenett
    Professor
    The KPA Group; SNI, Technion and IDR, Hebrew Univ., Israel
    ------------------------------



  • 18.  RE: mixed model

    Posted 04-11-2019 05:31
    The point is that this discussion combines conceptual and technical aspects.

    If you have lots of candidate effects (e.g., 3000 genes), you might want to use a random effects model as a surrogate for lots of fixed effects. You would do this on technical grounds.

    Conceptually, the overarching concept is generalisation of findings: what is your analysis aiming at? The various discussions in this thread can be classified according to this. One should therefore make the generalisation scope explicit, and that will clarify the fixed effect/random effect issue. Covariates are indeed observed. If you want to generalise findings a la Neyman/Rubin using potential outcomes, the covariates can be examined using causality arguments. Another approach is to consider graphical models a la Pearl. Yet another direction is the epidemiological considerations behind Cornfield's inequality.

    For a discussion on generalization of research claims in the context of the reproducibility debate see https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3035070


    ------------------------------
    Ron Kenett
    Professor
    The KPA Group; SNI, Technion and IDR, Hebrew Univ., Israel
    ------------------------------



  • 19.  RE: mixed model

    Posted 04-11-2019 13:36
    The Gelman & Hill textbook on multilevel / hierarchical models suggests abandoning the terms "fixed" and "random" entirely, due to the confusion around these terms. While I'm not sure I'd be willing to go that far, I do support their suggestion to focus on the description of the model.

    For example, we can write a random intercept mixed effects model with one covariate x, where j specifies a group and i specifies an observation within group j, as:
    y_ij = x_ij * β + α_j + ε_ij
    α_j ∼ Normal(0, γ²)
    ε_ij ∼ Normal(0, σ²)
     
    The intercept is "random" here because we're assuming some relationship (specifically Gaussian above, although in practice it can be anything) between the intercepts for the different groups (at least as I have defined the model... you could instead have intercepts that vary by other groupings to which an observation belongs).

    If we think in these terms, the "fixed" effect is "fixed" because it is the same for all observations. The "random" effect is "random" because the coefficients attached to the "random" factor (per Karen's comment) are drawn from a probability distribution. That probability distribution includes the effects of groups not in the data, which captures the idea of "random" effects covering data where the observed levels are a sample of many possible levels.

    Of course, focusing on the "model" may be confusing to a non-technical audience, so in those cases I would simplify to:
    1. Fixed effect - we assume no relationship between different levels of a factor
    2. Random effect - we assume there is a relationship (e.g. Gaussian) between different levels of a factor
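
    For readers who use R, a minimal sketch of the model above in lme4 notation (the data frame d, with columns y, x, and group, is hypothetical; note that lmer() also includes a global intercept by default, which, as discussed further down the thread, you would usually want anyway):

    library(lme4)
    # (1 | group) gives each group its own intercept alpha_j ~ Normal(0, gamma^2)
    fit <- lmer(y ~ x + (1 | group), data = d)
    summary(fit)  # "Fixed effects" reports beta; "Random effects" reports gamma^2 and sigma^2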


    ------------------------------
    Dave Klemish
    Ph.D. Candidate
    Duke University
    ------------------------------



  • 20.  RE: mixed model

    Posted 04-12-2019 07:46
    First of all, this is not true. Fixed effects can be nested in fixed or random effects. This is common in industrial, agricultural, and in-vitro biological experiments.

    The hierarchical models that Gelman advocates are good for progressively hierarchical cases such as regions within states or classes within schools, etc. Then there is the bigger problem of reducing everything to slopes on (0,1) indicator variables for categories. That is mathematically true but not useful to clients with an interest in the mean of each level of a category.

    On a practical note a few years ago I did a random intercept model in order to explore the impact of a pre-treatment for a clinical project.  The variability of the patients was intriguing to me but of little interest to the clinical or product team.  


    ------------------------------
    Georgette Asherman
    ------------------------------



  • 21.  RE: mixed model

    Posted 04-12-2019 11:49
    Sorry, I definitely did not mean to imply that the fixed effects in a model are necessarily exclusive of the random effects, if that's what you meant by fixed effects being nested in random effects. I agree that a fixed effect is often included for any factor for which you are also considering a random effect, because there is often interest in the population mean effect, exactly as you mentioned. In this case, the fixed effect measures the overall effect of a factor, whereas the random effect measures the variation of the factor's effect across sub-groups.

    For example, in a random intercept model you absolutely would include a global intercept term if your model for the random intercepts was Normal(0, sigma^2).

    I apologize that my example model didn't do this. I hope I didn't introduce more confusion to an already confusing topic! My main point was that the mathematics of a model makes explicit what is "fixed" and what is "random". That probably won't be a useful description for a non-statistical audience, so instead a description that mentions the implications for estimation may be more meaningful:
    • Random effect: We assume a relationship (i.e., a distribution) between the effects of different levels of this factor. Estimates of these effects are shrunk toward the overall mean using partial pooling of data across levels.
    • Fixed effect: We assume no relationship between the effects of differing levels of this factor. There is no shrinkage of the estimated effects as a result.
    As Ron pointed out, if you have many levels of a factor, you can treat that factor as "random" even if you've observed all of the levels (so it's not a sample), because of the desire for shrinkage of the effect estimates. To me, the idea of shrinkage / partial pooling is the key to whether an effect should be considered random vs. fixed, not necessarily whether the levels of that factor are completely observed or sampled.
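
    A small simulated sketch of that shrinkage in R with lme4 (everything here is made up purely for illustration):

    library(lme4)
    set.seed(1)
    d <- data.frame(g = factor(rep(1:20, each = 5)))   # 20 levels, 5 observations each
    d$y <- rnorm(20)[d$g] + rnorm(nrow(d))              # level means plus noise

    no_pool  <- coef(lm(y ~ 0 + g, data = d))           # fixed: one unshrunken mean per level
    fit      <- lmer(y ~ 1 + (1 | g), data = d)         # random: partial pooling across levels
    shrunken <- coef(fit)$g[, 1]                        # level means pulled toward the overall mean

    # the shrunken estimates deviate less from the overall mean than the unpooled ones
    mean(abs(no_pool - mean(d$y)))
    mean(abs(shrunken - fixef(fit)[1]))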

    ------------------------------
    David Klemish
    Ph.D. Candidate
    Duke University
    ------------------------------



  • 22.  RE: mixed model

    Posted 04-15-2019 11:40
    Georgette   said,
    "On a practical note a few years ago I did a random intercept model in order to explore the impact of a pre-treatment for a clinical project.  The variability of the patients was intriguing to me but of little interest to the clinical or product team."

    It is so sad, from the perspective of health care patients, that the variability of patients was of little interest to the clinical or product team. Between-patient variability can be an important factor in treatment effectiveness. It seems unethical to ignore it.


    ------------------------------
    Martha Smith
    University of Texas
    ------------------------------



  • 23.  RE: mixed model

    Posted 04-14-2019 00:35
    I agree heartily with what David Hoaglin has said.

    ------------------------------
    Martha Smith
    University of Texas
    ------------------------------



  • 24.  RE: mixed model

    Posted 04-16-2019 11:28
    Edited by John Major 04-16-2019 12:05
    If I understand correctly, the key distinction is whether the effect is estimated as part of the error.  Fixed effects are estimated to minimize the residual error.  Random effects are then analyzed as components of that residual error. Let's say you have three widely separated blocks of 5 plots each.  The plots are then further subdivided into 4 sub-plots, each receiving a different treatment (say, amount of fertilizer) and the response is yield of the crop. 

    (1) You could take the fertilizer amount as a fixed effect, ignore the plots and blocks, and assign all residual error to a single error term.
    (2) You could take each of the 15 plots as a fixed effect, giving each its own intercept.
    (3) You CANNOT, from model 2, additionally make the blocks a fixed effect, because then your model is not identifiable: each plot maps to a single block, so there is no way of distinguishing block and plot fixed effects.
    (4) You CAN, from model 2, additionally make the blocks a random effect; the residuals from model 2 are apportioned between the blocks and an independent residual error.
    (5) You could make both the blocks and the plots random effects, but here my understanding skates to the edge; I don't know what the consequences of that would be for the estimates of the fertilizer effect as compared to model 4.

    How to do it is pretty complicated, and Douglas Bates explains it all in a wonderful series of pdf documents and R code at http://lme4.r-forge.r-project.org/slides/2011-01-11-Madison/

    EDIT: I got example 4 backwards.  You could make the blocks fixed effects but the plots would have to be random effects.  This would give valid estimates for the block fixed effects.  As written above, the block random effects would be zero because the plot fixed effects would leave no measurable error at the block level. Told you it was complicated. ;-) 
    Why would you want to do that? I hear you ask.  In my business, block = type of insurance company, plot = insurance company, amount of fertilizer = financial characteristics at various points in time.
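
    In case it helps, here is how the identifiable models above might look in R with lme4 (following the Bates material linked above; the data frame and column names are hypothetical, and model 4 is written as corrected in the EDIT):

    library(lme4)
    m1 <- lm(yield ~ fert, data = d)                         # (1) fertilizer fixed, one error term
    m2 <- lm(yield ~ fert + plot, data = d)                  # (2) each of the 15 plots fixed
    # (3) adding block as a fixed effect to m2 is not identifiable: plot determines block
    m4 <- lmer(yield ~ fert + block + (1 | plot), data = d)  # (4) blocks fixed, plots random
    m5 <- lmer(yield ~ fert + (1 | block/plot), data = d)    # (5) both random, plots nested in blocks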

    ------------------------------
    John Major
    Guy Carpenter & Co., LLC
    ------------------------------



  • 25.  RE: mixed model

    Posted 04-17-2019 12:04
    This might help, it might not. But here goes.

    Suppose that I am doing an experiment and I have 3 technicians measuring the diameter of a set of metal or wood rods with a single caliper. If my company only has 3 technicians and 1 set of rods, then these would be "fixed effects". If my company has many technicians and many sets of rods, they would be random effects.  

    There has also been some discussion of how, if a value can fluctuate randomly (as in you have little or no control over it), it's a random effect. If you can adjust the value as needed, it's a fixed effect.

    If I design an experiment looking at 5 factors, break it into 4 blocks, and measure other variables (covariates), then my blocks and covariates are random effects, and my 5 factors are fixed effects.
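
    A sketch of the two situations in the technician example, in R/lme4 terms (the data frame and column names are hypothetical):

    library(lme4)
    # only 3 technicians and 1 set of rods exist: treat them as fixed
    fit_fixed  <- lm(diameter ~ technician + rod, data = meas)
    # technicians and rods are samples from larger pools: treat them as random
    fit_random <- lmer(diameter ~ 1 + (1 | technician) + (1 | rod), data = meas)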

    ------------------------------
    Andrew Ekstrom

    Statistician, Chemist, HPC Abuser;-)
    ------------------------------



  • 26.  RE: mixed model

    Posted 04-17-2019 15:58
    I keep coming back to this thread, which is really interesting.  As a consultant, I ask myself:  How can I explain a complex concept to a client using a quick example if I only had a limited amount of time to do it?  

    The easiest explanation that I can think of is that fixed effects focus on what is "typical" while random effects focus on "deviations from what is typical".  

    First, I would give a quick example to explain what "typical" is. For example, imagine we select 30 students at random from a single school (all in grade 5) and record their grades on the same math test. A "typical" grade for these students would be the average of the 30 students' grades, and it would estimate the unknown "typical" grade of all grade-5 students in that school. The "typical" grade is specific to this school.

    Then, I would expand on this example to explain what "deviation from what is typical" means. Imagine we randomly select 10 schools in the same district and then, within each school, randomly select 30 students (all in grade 5) who will take the same math test. Obviously, each of the 10 schools will have a "typical" grade. School A might have a "typical" grade of 75 (out of 100), School B might have a "typical" grade of 80, School C might have a "typical" grade of 85, etc.

    For each school, the "typical" grade can be conceived of in relation to the "typical" grade corresponding to a "typical" school. For example, School A's "typical" grade might be lower than the "typical" grade of the "typical" school. School B's "typical" grade might be higher than the "typical" grade of the "typical" school. School C's "typical" grade might be exactly equal to the "typical" grade of the "typical" school, etc.

    We need a school-specific random effect to keep track of where the school-specific "typical" grade is in relation to the "typical" grade of the "typical" school.  For example, a negative school-specific random effect associated with School A would indicate that School A's "typical" grade is lower than the "typical" grade of the "typical" school.  On the other hand, a positive school-specific random effect associated with School B would indicate that School B's "typical" grade is higher than the "typical" grade of the "typical" school.  A zero school-specific random effect associated with School C would indicate that School C's "typical" grade is exactly equal to the "typical" grade of the "typical" school, etc. 

    While the sign of the school-specific random effect tells us whether a school-specific "typical" grade is smaller than (negative sign), equal to, or larger than (positive sign) the "typical" grade of the "typical" school, the magnitude of the school-specific random effect tells us the extent of the deviation of the school-specific "typical" grade from the "typical" grade of the "typical" school.  The larger the magnitude of the school-specific random effect, the larger this deviation.  Conversely, the smaller the magnitude of the school-specific random effect, the smaller this deviation.  A school-specific random effect equal to 0 implies no deviation. 

    Before we fit a mixed effects model to data such as those described above for the 10 randomly selected schools, we won't know what the "typical" grade in the "typical" school is, or to what extent the school-specific "typical" grades deviate from it. Using the model, we can estimate the "typical" grade in the "typical" school and quantify the magnitude and sign of the deviations of the school-specific "typical" grades from it. This will also allow us to estimate the school-specific "typical" grades.

    One way I think about a random effect is as a "sponge" term that captures all the influences affecting how a school-specific "typical" grade deviates from the "typical" grade of the "typical" school. Examples of such influences may include school size, school type (e.g., private, public), etc. These influences play out in different ways for each school, which is why we see School A behaving differently from all other schools, School B behaving differently from all other schools, etc. Of course, if two schools are subject to the same influences, which interplay in the same way, their corresponding random effects will be the same.

    Note that, in the above example, the random school-specific effect is a so-called random intercept effect. 
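
    In R/lme4 notation, a minimal sketch of the model behind this example (the data frame and column names are hypothetical):

    library(lme4)
    fit <- lmer(grade ~ 1 + (1 | school), data = tests)
    fixef(fit)         # the "typical" grade in the "typical" school
    ranef(fit)$school  # each school's deviation (sign and magnitude) from that typical grade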

    Anyway, the example given above can be further expanded if one wishes to.  


    ------------------------------
    Isabella Ghement
    Ghement Statistical Consulting Company Ltd.
    ------------------------------