ASA Connect

  • 1.  Logistic Regression and Discriminant Analysis by OLS

    Posted 04-03-2016 18:09

    Dear colleagues, I recently dug up an interesting 1983 paper by Gus W. Haggstrom entitled "Logistic Regression and Discriminant Analysis by Ordinary Least Squares" (URL: http://www.jstor.org/stable/1391344). The paper claims that one can obtain the MLE coefficients for a dichotomous logistic regression from an "intermediate least squares" (ILS) regression model (Equation 2.1 in the paper):


    y_i = alpha + Beta' * x_i + e_i


    where y_i is an indicator variable of 0's and 1's, and the ILS regression coefficients are fitted using Ordinary Least Squares. The MLE coefficients can be obtained from the "ILS" coefficients, it is claimed, using fairly simple relations; see Equation 2.3 in the paper.

    Intrigued, I tried to confirm the result in R, but to no avail. Attached is a plain text file containing R code, in which I unsuccessfully try out Theorem 1 of the paper (Equation 2.3), as well as a hand-computation of the MLE coefficients (Equation 1.13), using a data set that comes pre-packaged with R ("mtcars"). Where are my errors?

    Thanks very much!

    Joe

    ------------------------------
    Jose Maisog
    Senior Informatics Scientist
    Blue Health Intelligence
    ------------------------------


  • 2.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-04-2016 10:08

    Joe,

    You (probably) didn't do anything wrong in the first two parts of your R code. The crucial thing is that there are different MLEs involved here, and that "logistic regression" in Haggstrom's paper is not identical to today's textbook logistic regression.

    Note that in Haggstrom's paper, a distributional assumption is made for X given class membership (multivariate normal, with equal covariance matrices in the two classes). You can view this as a joint model for Y and X; the likelihood function is then the joint probability of the observed Y and X, and maximizing it yields the formulas for the MLE in Haggstrom's paper. (Note that these are closed-form solutions.)

    What is usually done in logistic regression as we know it (and what you did by calling R's glm routine) is not a joint modelling of Y and X, but a modelling of Y conditional on X. The likelihood is then just the conditional probability of the observed Y given X. One could call this "conditional MLE" (and Haggstrom calls it exactly that; see Section 5 of his paper). In general there is no closed-form solution for this conditional MLE.

    MLE and conditional MLE result in different estimators. Haggstrom's idea of using OLS estimation to derive the ML estimator does not carry over to the conditional ML estimator (though that would be nice, as it would yield a closed-form solution).

    (A Google search for "difference between logistic regression and discriminant analysis" will give you even more insight.)

    Concerning your "manual" calculation of formula (1.13): you use the sample covariance matrix of the full data as an estimator of the common covariance within each group. That is not a good estimator, though (think of two groups with variance 0 in each group: if the group means differ, the overall sample variance is larger than 0 even though the common within-group variance is 0), and it is not the ML estimator.
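    That parenthetical in miniature, with made-up two-group data (not from mtcars):

    ```r
    # Two groups with zero variance within each group but different means:
    g1 <- c(1, 1, 1)
    g2 <- c(5, 5, 5)

    var(c(g1, g2))  # overall sample variance: 4.8 -- not 0

    # The pooled within-group (ML, divide-by-n) estimate correctly gives 0:
    n1 <- length(g1); n2 <- length(g2)
    ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2)  # 0
    ```

    The overall variance mixes the between-group mean difference into the estimate, which is exactly why it overstates the common within-group covariance.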

    -Hans-

    ------------------------------
    Hans Kiesl
    Regensburg University of Applied Sciences
    Germany



  • 3.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-05-2016 08:42

    Hi Hans, thanks for the feedback. Yes, that was the key point -- what Haggstrom calls logistic regression isn't what we call logistic regression in 2016! Do you think it is fair to say that Haggstrom's logistic regression is closer to what we now call linear discriminant analysis?

    ------------------------------
    Jose Maisog
    Senior Informatics Scientist
    Blue Health Intelligence



  • 4.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-06-2016 14:21

    Joe,

    I can't speak for all statisticians in 2016, but I would indeed speak of logistic regression only when Y is modeled conditional on X (without an additional model for X), and use the term LDA (or QDA) in the other case. Others might disagree, but most textbooks certainly agree.

    BTW, concerning your original R code:

    Instead of the line

    SIGMA <- cov(mtcars[,c("hp","wt")])

    you might try

    data1 <- mtcars[mtcars$am == 1, c("hp","wt")]
    data2 <- mtcars[mtcars$am == 0, c("hp","wt")]
    n1 <- nrow(data1); n2 <- nrow(data2); n <- n1 + n2
    SIGMA <- ((n1 - 1) * cov(data1) + (n2 - 1) * cov(data2)) / n

    This gives the ML estimate of Sigma and brings the last two sets of your estimates into agreement.

    -Hans-

    ------------------------------
    Hans Kiesl
    Regensburg University of Applied Sciences
    Germany



  • 5.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-04-2016 10:16

    I do not believe that ML estimates can be recovered directly from a linear regression of y on x, but it is well known that the normal linear discriminant model, where x|y is multivariate normal with mean mu(y) and common covariance matrix Sigma, has straightforward ML estimates -- the sample means of x within each group, and the pooled sample covariance matrix -- and the model implies a logistic regression of y|x.

    Under the normality assumptions, the coefficients of that logistic regression are simple functions of the discriminant analysis parameters, and ML estimates are obtained by substituting ML estimates into these expressions. This approach is both simpler and more efficient than ML for the logistic regression model that conditions on X, and it arguably has better small-sample properties.

    However, it does of course rely on normality, which is a strong assumption. Extensions where some X's are normal and some are not are possible via the general location model.

    Rod Little
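    That substitution can be sketched in R (assuming, as in Joe's script, hp and wt as predictors and am as the 0/1 response in mtcars):

    ```r
    # Discriminant-model MLE: plug ML estimates of the group means and the
    # pooled (divide-by-n) covariance into the implied logistic coefficients
    #   beta  = Sigma^{-1} (mu1 - mu0)
    #   alpha = log(n1/n0) - (1/2) (mu1 + mu0)' Sigma^{-1} (mu1 - mu0)
    X  <- as.matrix(mtcars[, c("hp", "wt")])
    y  <- mtcars$am
    X1 <- X[y == 1, , drop = FALSE]
    X0 <- X[y == 0, , drop = FALSE]
    n1 <- nrow(X1); n0 <- nrow(X0); n <- n1 + n0
    mu1 <- colMeans(X1); mu0 <- colMeans(X0)
    Sigma <- ((n1 - 1) * cov(X1) + (n0 - 1) * cov(X0)) / n  # ML pooled covariance
    beta  <- solve(Sigma, mu1 - mu0)
    alpha <- log(n1 / n0) - 0.5 * sum((mu1 + mu0) * beta)

    c(alpha = alpha, beta)                                      # discriminant MLE
    coef(glm(am ~ hp + wt, data = mtcars, family = binomial))   # conditional MLE
    ```

    The two sets of coefficients generally differ in a sample this small, which is consistent with the discussion above.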

    ------------------------------
    Roderick Little
    University of Michigan/U.S. Census Bureau



  • 6.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-05-2016 08:47

    Thanks for your response, Rod.

    > I do not believe that ML estimates can be recovered directly from a linear regression of y on x,

    I am no theoretician, but having dug further into this question, I think that is true. The MLE for logistic regression today is computed by an iterative procedure, and I don't see how it could boil down to a single least squares regression.
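    For what it's worth, that iterative procedure is iteratively reweighted least squares (IRLS): each step is a *weighted* least squares fit whose weights and working response depend on the current fit, which is why it doesn't collapse to one OLS pass. A toy hand-rolled version (irls_logistic is my own illustrative function, not from any package):

    ```r
    # Conditional logistic MLE via IRLS (Newton's method in disguise).
    irls_logistic <- function(X, y, tol = 1e-10, maxit = 100) {
      beta <- rep(0, ncol(X))
      for (it in seq_len(maxit)) {
        eta <- drop(X %*% beta)
        p   <- 1 / (1 + exp(-eta))     # current fitted probabilities
        w   <- p * (1 - p)             # IRLS weights
        z   <- eta + (y - p) / w       # working response
        beta_new <- drop(solve(crossprod(X, w * X), crossprod(X, w * z)))
        if (max(abs(beta_new - beta)) < tol) return(beta_new)
        beta <- beta_new
      }
      beta
    }

    X <- cbind(1, as.matrix(mtcars[, c("hp", "wt")]))
    irls_logistic(X, mtcars$am)                              # hand-rolled IRLS
    coef(glm(am ~ hp + wt, data = mtcars, family = binomial))  # same estimates
    ```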

    I guess I have the same question for you that I had for Hans: Do you think that what Haggstrom called "logistic regression" in 1983 might be equivalent to what we call "linear discriminant analysis" today, in 2016?

    ------------------------------
    Jose Maisog
    Senior Informatics Scientist
    Blue Health Intelligence



  • 7.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-04-2016 13:48

    I think with a sample size of 32, the asymptotics haven't kicked in yet. Here is a simulation with n = 3000 showing that it works to a few decimal places of precision: https://gist.github.com/nfultz/7d2fb0f1234c16e0f5344dd4e09fc424
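    A minimal base-R analogue of the idea (not the gist's code; it uses the discriminant-model plug-in estimates, which per the thread equal Haggstrom's OLS-derived MLE, rather than the paper's Equation 2.3 directly):

    ```r
    # When x | y really is normal with common variance, the discriminant-model
    # MLE and glm's conditional MLE agree closely at large n.
    set.seed(42)
    n <- 3000
    y <- rbinom(n, 1, 0.5)
    x <- rnorm(n, mean = y)            # x | y ~ N(y, 1), one predictor
    glm_beta <- unname(coef(glm(y ~ x, family = binomial)))

    n1 <- sum(y); n0 <- n - n1
    mu1 <- mean(x[y == 1]); mu0 <- mean(x[y == 0])
    s2  <- ((n1 - 1) * var(x[y == 1]) + (n0 - 1) * var(x[y == 0])) / n
    lda_beta <- c(log(n1 / n0) - (mu1 + mu0) * (mu1 - mu0) / (2 * s2),
                  (mu1 - mu0) / s2)

    round(rbind(glm = glm_beta, lda = lda_beta), 2)  # nearly identical rows
    ```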

    ------------------------------
    Neal Fultz
    Principal Data Scientist
    OpenMail



  • 8.  RE: Logistic Regression and Discriminant Analysis by OLS

    Posted 04-05-2016 08:49

    Thanks for the simulation code, Neal. I hadn't thought about the sample size issue!

    ------------------------------
    Jose Maisog
    Senior Informatics Scientist
    Blue Health Intelligence