ASA Connect

  • 1.  Idea for average effect plots in a linear regression model

    Posted 07-28-2020 19:18
    Hi everyone,

    When fitting linear regression models, we can visualize the effects produced by the model using effect plots.  These plots essentially show how the predicted values of the response variable Y change with the values of a predictor variable Xj, while holding the values of all other predictor variables (i.e., the non-focal predictor variables) in the model fixed at some conveniently chosen values. 

    In practice, if the model has many predictor variables, can we use a variation of these effect plots? The variation I had in mind involves considering a specific value for the focal predictor Xj but allowing each of the non-focal predictors to take all values observed in the data. Once the predicted values of Y are computed for that specific value of Xj, we can average them across all values considered for the non-focal predictors and report the average predicted value of Y and a 95% uncertainty interval obtained via bootstrapping.
    Would something like this make any sense? The idea is that we would not have to choose "typical" values for the non-focal predictors, but rather consider all of the values of these predictors observed in the data to give a more complete picture of what is going on.

    Have other people already tried something like this? If it makes any sense, could we describe this type of plot as an "average effect plot" or something like that?

    Here is a quick R example (minus the bootstrapping) of what I have in mind.  In this example, the linear model (lm) relates miles per gallon (mpg) to weight (wt) and number of cylinders (cyl) for a sample of 32 car models.  To construct the "average effect plot" for the cyl predictor, I first get the fitted values from the model.  Then, I separate those into fitted values corresponding to cyl = 4 and compute their average.  The fitted values corresponding to cyl = 6 are also averaged, etc.  The resulting average fitted values are:

    cyl = 4:  26.6

    cyl = 6: 19.7

    cyl = 8: 15.1

    These three values would be plotted against the cyl values to obtain the "average effect plot".  Uncertainty bands can also be added to the plot for each average predicted value displayed.

    data(mtcars)
    mtcars$cyl <- factor(mtcars$cyl)

    m <- lm(mpg ~ wt + cyl, data = mtcars)

    # Average the fitted values within each observed level of cyl
    cylAvgPred <- function(m) {
      pred <- predict(m)          # fitted values for the original data
      cyl  <- model.frame(m)$cyl  # cyl values used in the fit
      list(mean.pred.cyl.4 = mean(pred[cyl == "4"]),
           mean.pred.cyl.6 = mean(pred[cyl == "6"]),
           mean.pred.cyl.8 = mean(pred[cyl == "8"]))
    }

    cylAvgPred(m)
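The function above averages fitted values within the cars that actually have each cyl value. The variation described earlier in this post (fix cyl at one value for every car, keep each car's observed wt, predict, then average) would look a little different. A minimal sketch of that reading, with a percentile bootstrap for the 95% interval, might be the following. The helper name `avgPredByCyl`, the seed, the number of resamples `B`, and the choice to redraw resamples that miss a cyl level are all assumptions made for illustration, not part of the original example.

```r
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)

# Hypothetical helper: refit the model on `dat`, then for each level of cyl
# set every car to that level, keep the observed wt, predict, and average.
avgPredByCyl <- function(dat) {
  fit <- lm(mpg ~ wt + cyl, data = dat)
  sapply(levels(dat$cyl), function(lev) {
    newdat <- dat
    newdat$cyl <- factor(rep(lev, nrow(dat)), levels = levels(dat$cyl))
    mean(predict(fit, newdata = newdat))
  })
}

est <- avgPredByCyl(mtcars)

# Percentile bootstrap: resample cars with replacement and recompute.
set.seed(1)   # arbitrary seed, for reproducibility only
B <- 500      # arbitrary number of resamples
boot <- replicate(B, {
  repeat {
    idx <- sample(nrow(mtcars), replace = TRUE)
    # Simplification: redraw in the rare case a resample misses a cyl level
    if (all(table(mtcars$cyl[idx]) > 0)) break
  }
  avgPredByCyl(mtcars[idx, ])
})

# Rows of `boot` are cyl levels, columns are bootstrap replicates
ci <- apply(boot, 1, quantile, probs = c(0.025, 0.975))
rbind(estimate = est, ci)
```

Because this model is additive, the differences between these averages are just the fitted cyl coefficients. The averages themselves will generally differ from the within-group averages reported above, since here every car's wt contributes to every cyl level.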


    Many thanks,

    Isabella

    ------------------------------
    Isabella R. Ghement, Ph.D.
    Ghement Statistical Consulting Company Ltd.
    ------------------------------


  • 2.  RE: Idea for average effect plots in a linear regression model

    Posted 07-29-2020 08:49
    Hello Isabella,

    I have done this, and what you are describing is typically referred to as partial or marginal effects. I recommend the margins package:

    https://cran.r-project.org/web/packages/margins/index.html
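To make the term concrete, here is a rough base R sketch of an average marginal effect for the mtcars model from the first post. It deliberately does not call the margins package, so nothing here should be read as that package's interface; the step size `h` and all object names are assumptions chosen for illustration.

```r
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
m <- lm(mpg ~ wt + cyl, data = mtcars)

# Average marginal effect of wt: nudge wt by a small step h for every car,
# take the change in prediction per unit of wt, and average over cars.
h <- 1e-4   # arbitrary small step
up <- mtcars
up$wt <- up$wt + h
ame_wt <- mean((predict(m, newdata = up) - predict(m, newdata = mtcars)) / h)

# Average discrete change for cyl: predict with every car set to cyl = 6,
# then with every car set to cyl = 4, and average the differences.
at4 <- mtcars
at4$cyl <- factor(rep("4", nrow(mtcars)), levels = levels(mtcars$cyl))
at6 <- mtcars
at6$cyl <- factor(rep("6", nrow(mtcars)), levels = levels(mtcars$cyl))
adc_cyl6 <- mean(predict(m, newdata = at6) - predict(m, newdata = at4))

c(ame_wt = ame_wt, adc_cyl6 = adc_cyl6)
```

In a linear model with no interactions these averages simply reproduce the fitted coefficients for wt and cyl6. The averaging only starts to matter once the model has interactions or nonlinear terms.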

    Robert


    ------------------------------
    Robert O'Brien
    ------------------------------



  • 3.  RE: Idea for average effect plots in a linear regression model

    Posted 07-29-2020 10:00
    Hi Isabella,

    Partial Dependence Plots are similar to your idea and are popular in machine learning.  They have some drawbacks, and related graphs are Individual Conditional Expectation plots and Accumulated Local Effects plots. These methods are implemented in several R packages (e.g. DALEX).

    PDPs are discussed here: https://christophm.github.io/interpretable-ml-book/pdp.html. The sections that follow it discuss ICE and ALE plots.
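As a rough illustration of how ICE curves and a PDP relate, here is a base R sketch for the wt predictor in the mtcars model from the first post. It does not use DALEX or any other package; the grid size and object names are assumptions chosen for illustration.

```r
data(mtcars)
mtcars$cyl <- factor(mtcars$cyl)
m <- lm(mpg ~ wt + cyl, data = mtcars)

# Grid of wt values spanning the observed range (25 points is arbitrary)
wt_grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 25)

# ICE: for each grid value, set every car's wt to it and predict.
# Result: one row per car, one column per grid value.
ice <- sapply(wt_grid, function(w) {
  newdat <- mtcars
  newdat$wt <- w
  predict(m, newdata = newdat)
})

# PDP: average the ICE curves over cars at each grid value
pdp <- colMeans(ice)

# Grey lines are the individual ICE curves, the thick line is the PDP
matplot(wt_grid, t(ice), type = "l", lty = 1, col = "grey70",
        xlab = "wt", ylab = "Predicted mpg")
lines(wt_grid, pdp, lwd = 3)
```

For this additive linear model the ICE curves are parallel straight lines, so the PDP carries the same slope as the wt coefficient. The two kinds of plot become more informative relative to each other when the model includes interactions.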

    Kind regards,
    Stan

    ------------------------------
    Stanley E. Lazic, PhD
    https://stanlazic.github.io/
    ------------------------------